Improper Input Validation – A Never-Ending Concern

Have you ever doubted your cybersecurity practice? Regardless of the role you fulfill in IT, you are bound to have a certain level of awareness in terms of risk and potential consequences involved in your line of work.

If you’ve been reading our blog, we’re certain you’ve noticed a pattern running through most of the vulnerabilities we’ve covered. Recently we’ve seen this type of flaw in CVE-2022-27926, CVE-2017-7494, and CVE-2022-21587, which all share a more or less common cause.

Of course, we’re talking about input validation.

Whether you call it validation, sanitization, or neutralization, it is a very common mistake made in the software development process. Whether you’re a junior or a senior programmer - you’ve most likely been guilty of it on numerous occasions, and you might not even be aware of it.

That said, we’re about to take this recurring issue apart systematically and try to find the flaw in the development process that is its root cause.

Dare We Call It… A Cultural Mistake?

As software development methodologies advance, new IT roles arise literally every year, revealing a brand new path full of pitfalls. Regardless of that, in well-organized teams, the jurisdiction over different parts of the development process is at this point very well-defined. But even large enterprises that employ some of the best tactics to avoid security issues in their software end up being vulnerable or even leaving their customers exposed to risk.

This time, we might be able to blame it on the system. While many flaws are caused by outdated libraries, misconfigured servers, and network vulnerabilities, they also occur when a developer fails to properly validate, sanitize, or neutralize user input, leaving the application open to attacks via specially crafted strings, queries, and files.

Developers tend to focus on building functionality and delivering features quickly, whereas QA professionals focus on testing the application for defects and ensuring it meets user requirements. A developer might overlook input validation vulnerabilities, either because they are unaware of the potential risks or because stress and deadlines push security below other tasks. A QA professional, on the other hand, may have a security-focused mentality but lack the programming experience needed to identify input validation vulnerabilities.

This can lead to vulnerabilities slipping through the cracks, as the average QA engineer is unlikely to be aware of the specific coding practices that are required to prevent input validation vulnerabilities, and therefore may not be able to design the tests adequately to detect these sorts of flaws.

What Does It Take To Be A Hero?

It takes at least a few solid years of software development experience with an emphasis on security to be able to foresee these problems. Being able to think outside the box is also a must in order to do this job correctly.

Normally, security experts are responsible for verifying the safety of the code. The fact of the matter is that in IT there’s currently a shortage of personnel with the appropriate skill set for this job, so even the big companies struggle to employ enough security experts.

Verifying code is a time-consuming and resource-intensive process. Depending on the complexity and scope of the software project, checking every line of code could require up to one security expert for every 15 devs. In reality, surveys show that there’s only one security expert for every 100 devs.

Although there isn’t much we can do about those numbers, we can try adapting to the situation by prioritizing security in the software development and testing lifecycle, ensuring that all team members receive appropriate training and guidance on secure coding practices. This can help to prevent input validation vulnerabilities and other security flaws from being introduced into the code and improve the overall security of the application.

Wrong Priorities

We’ve seen some employers advertising their vacant QA positions using the slogan “Can’t code, but you want a developer salary?”. Well, that’s not the reality.

According to multiple sources, in the US the annual salary range for a software developer (depending on the level of seniority) is anywhere between $70,000 and $120,000, while even the most experienced QA engineers average $70,000 at best. We could argue that the lack of security engineers can be offset by testers who are more thoroughly trained, and that would be a very logical argument. But when you look at most job postings for QA-related positions, they rarely require any programming experience, which is a must if you want an ace on your team hunting down these sneaky flaws.

QA is often seen as a good entry-level position for people who are just starting to build their careers in IT, and that is true to some extent. The problem is the upper threshold: you can’t expect an expert to willingly work for a lower wage, especially if they are to take on responsibility that would theoretically fall under another role. It’s not uncommon to see senior QA engineers trying to “switch lanes” in their careers because of the limited financial opportunities, and it’s a shame since there are many more levels to their craft.

We see this as a reality check that calls for a global mindset change.

Prevention And Detection

Preventing and detecting improper input validation requires a combination of careful planning, implementation, and testing. Several techniques can be applied, such as using automated tools to scan the codebase for known vulnerabilities, conducting manual code reviews to identify potential flaws, and implementing input validation libraries or frameworks that are specifically designed to catch common errors.

There are two types of input validation:

Syntactic validation checks whether the input conforms to the expected format or syntax, such as checking for the correct number of characters, valid characters, and correct data type. This type of validation ensures that the input meets the basic requirements and structure for further processing.
Semantic validation, on the other hand, checks the meaning and context of the input to ensure that it is reasonable and appropriate for the given situation. For example, if a form asks for a user's age, semantic validation checks that the entered value is a reasonable age and not a negative number. Another example is a job application form with work experience entries: the end date should not be allowed to precede the start date.
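
To make the distinction concrete, here is a minimal sketch in JavaScript (Node.js). The function names and the 0 to 130 age range are assumptions made for illustration; they do not come from any library or standard.

```javascript
// Syntactic check: is the value a string shaped like a whole number
// (optional minus sign, one to three digits)?
function isSyntacticallyValidAge(input) {
  return typeof input === 'string' && /^-?\d{1,3}$/.test(input);
}

// Semantic check: is the number plausible as a human age?
// The 0-130 range is an assumed business rule for this sketch.
function isSemanticallyValidAge(age) {
  return Number.isInteger(age) && age >= 0 && age <= 130;
}

function validateAge(input) {
  if (!isSyntacticallyValidAge(input)) {
    return { ok: false, reason: 'age must be a whole number' };
  }
  const age = Number(input);
  if (!isSemanticallyValidAge(age)) {
    return { ok: false, reason: 'age is out of range' };
  }
  return { ok: true, value: age };
}

// Semantic check across two fields: the end date must not precede the start date.
function isValidDateRange(startIso, endIso) {
  const start = Date.parse(startIso);
  const end = Date.parse(endIso);
  return !Number.isNaN(start) && !Number.isNaN(end) && start <= end;
}

console.log(validateAge('42'));   // passes both checks
console.log(validateAge('abc'));  // fails the syntactic check
console.log(validateAge('-5'));   // well-formed, but fails the semantic check
console.log(isValidDateRange('2020-01-01', '2019-06-30')); // end before start
```

Running the syntactic check first keeps the semantic rules simple, because they only ever see values that already have the expected shape.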

Improper input validation vulnerabilities are more likely to be exploited through web applications and services, as they are more exposed and rely heavily on user input that can be manipulated to execute malicious code on the server side, so we’re going to focus on the following three technologies.

PHP

PHP is a favored language for web development that is commonly used to construct dynamic websites and web applications. Nevertheless, PHP-based applications are prone to inadequate input validation, which is a common issue that leads to various kinds of attacks, such as SQL injection and XSS.

Here are some tips for properly validating, sanitizing, and neutralizing user input in PHP:

Use PHP's built-in validation functions:
PHP has built-in functions for validating common input types, such as filter_var() for filtering and validating email addresses, URLs, and IP addresses. These functions are designed to handle the nuances and edge cases of validating these types of input, so it's a good idea to use them instead of writing your own validation code.

Use prepared statements for database queries:
Prepared statements are a powerful tool for preventing SQL injection attacks. They allow you to write SQL queries with placeholders for user input. The values are sent to the database separately from the query text at runtime, so they are always treated as data and never as SQL. Prepared statements can be used with most PHP database extensions, including MySQLi and PDO.

Use HTML Purifier to sanitize HTML input:
HTML input can be dangerous, as it can contain scripts or other malicious code. HTML Purifier is a PHP library that can be used to sanitize HTML input, removing any potentially malicious code and leaving only safe, well-formed HTML.

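This is the most basic login form example:

<?php
$username = $_POST['username'];
$password = $_POST['password'];

// User input is concatenated straight into the SQL string
$sql = "SELECT * FROM users WHERE username = '$username' AND password = '$password'";
$user = $pdo->query($sql)->fetch();

if ($user) {
    // grant access
    echo "Welcome, $username";
} else {
    // deny access
    echo "Login failed for $username";
}
?>

This code is vulnerable to SQL injection and XSS (cross-site scripting) attacks: the submitted values are concatenated directly into the SQL string, and the username is echoed back into the page without escaping.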

These are just a few benign examples of what an attacker could submit in our username and password fields that would confuse our PHP form and lead to unwanted behavior:

Username: ' OR 1=1 --
Password: anything

Username: <script>alert('XSS');</script>
Password: anything

By updating the above PHP code, we get something like this:

<?php
// Reject missing, empty, or non-string values (PHP turns username[]=x into an array)
if (!empty($_POST['username']) && is_string($_POST['username'])) {
    $username = $_POST['username'];
} else {
    // handle invalid input
    die("Invalid input for username.");
}

if (!empty($_POST['password']) && is_string($_POST['password'])) {
    $password = $_POST['password'];
} else {
    // handle invalid input
    die("Invalid input for password.");
}

// Use prepared statements for database queries to prevent SQL injection attacks
$stmt = $pdo->prepare('SELECT * FROM users WHERE username = ?');
$stmt->execute([$username]);
$user = $stmt->fetch();

// Assumes the password column holds a hash created with password_hash()
if ($user && password_verify($password, $user['password'])) {
    // Escape user input before echoing it into the page to prevent XSS
    echo 'Welcome, ' . htmlspecialchars($username, ENT_QUOTES, 'UTF-8');

    // If the form also accepts rich text, use HTML Purifier to sanitize it
    $purifier = new HTMLPurifier();
    $safe_html = $purifier->purify($_POST['html_input'] ?? '');

    // grant access
} else {
    // deny access
}
?>

This is the same login form with the above-recommended security measures in place.

Java

Java is a well-known web development language used extensively to develop enterprise-level web applications. Despite this, Java applications can also be vulnerable to input validation problems that lead to various types of attacks, including SQL injection, command injection, and directory traversal.

This is a Java program that prompts the user to enter a file name and then reads the contents of the specified file and outputs them to the console.

import java.io.File;
import java.io.IOException;
import java.util.Scanner;

public class FileReadExample {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        System.out.print("Enter file name: ");
        String fileName = scanner.nextLine();
        File file = new File(fileName);
        try {
            Scanner fileScanner = new Scanner(file);
            while (fileScanner.hasNextLine()) {
                String line = fileScanner.nextLine();
                System.out.println(line);
            }
            fileScanner.close();
        } catch (IOException e) {
            System.out.println("File not found.");
        }
    }
}

In this code, the user is prompted to enter a file name, and the program attempts to read the contents of the file. However, there is no input validation to ensure that the user is only entering a valid file name.

This can lead to a serious security vulnerability known as a path traversal attack, where a malicious user can input a specially crafted file name that allows them to access files outside of the intended directory. Here's an example of a malicious input that could exploit this vulnerability:

Enter file name: ../../../../etc/passwd

This input refers to a file located outside of the intended directory and could potentially allow the attacker to read sensitive system files such as /etc/passwd.

To prevent this input validation flaw, we can add validation code that ensures the input refers to a file inside the intended directory. Here is the same program with the input validated:

import java.io.File;
import java.io.IOException;
import java.util.Scanner;

public class FileReadExample {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        File file = null;
        while (file == null) {
            System.out.print("Enter file name (must be a file in the data/ directory): ");
            String fileName = scanner.nextLine();
            if (fileName.isEmpty()) {
                System.out.println("File name cannot be empty. Please try again.");
                continue;
            }
            try {
                // Resolve "..", "." and symbolic links before checking the location
                File baseDir = new File("data").getCanonicalFile();
                File candidate = new File(fileName).getCanonicalFile();
                if (!candidate.toPath().startsWith(baseDir.toPath())) {
                    System.out.println("File must be in the data/ directory. Please try again.");
                } else if (!candidate.isFile()) {
                    System.out.println("File not found. Please try again.");
                } else {
                    file = candidate;
                }
            } catch (IOException e) {
                System.out.println("Invalid file name. Please try again.");
            }
        }
        // try-with-resources closes the scanner even if reading fails
        try (Scanner fileScanner = new Scanner(file)) {
            while (fileScanner.hasNextLine()) {
                System.out.println(fileScanner.nextLine());
            }
        } catch (IOException e) {
            System.out.println("File not found.");
        }
    }
}

Here are the steps we took to make it secure:

Input validation:
To prevent path traversal attacks, we validate the user input to ensure that it refers to a file within the intended directory. Checking that the string starts with data/ would not be enough, because an input such as data/../../etc/passwd also starts with that prefix. Instead, the program resolves the input to its canonical path, which removes any ../ segments, and then checks that the result is located under data/, exists, and is a regular file (not a directory). If the input is invalid, we print an error message and prompt the user to try again.

Re-prompting:
We use a while loop to keep prompting until the user provides a valid file name within the data/ directory. The program never opens a file that has not passed the checks.

Limited access to files:
By restricting file access to the data/ directory, we make sure that a request can only reach files that were placed there deliberately. This check does not replace operating system permissions: the program should still run with the least privileges it needs, so that a bypass of the validation exposes as little as possible.

Proper exception handling:
In the event that an error occurs during file access, we use proper exception handling to handle the error and prevent sensitive system information from being leaked to the user. We catch the IOException and print a generic error message instead of the system error message, which could potentially leak sensitive information.

By implementing these security tactics, we are able to prevent improper input validation vulnerabilities and ensure that the user can only access files within the intended directory, limiting the potential damage that a path traversal attack could cause.
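
The containment idea, resolve the path first and only then compare it against the allowed directory, is not specific to Java. As a rough sketch, the same check could look like this in Node.js. The BASE_DIR constant and the resolveInsideBase helper are hypothetical names chosen for this example, and the input is interpreted relative to the data/ directory.

```javascript
const path = require('node:path');

// Hypothetical allowed directory, mirroring the data/ directory used above.
const BASE_DIR = path.resolve('data');

// Returns the absolute path if it stays inside BASE_DIR, otherwise null.
function resolveInsideBase(userInput) {
  if (typeof userInput !== 'string' || userInput.length === 0) {
    return null;
  }
  // path.resolve normalizes "." and ".." segments. It does not follow symlinks.
  const resolved = path.resolve(BASE_DIR, userInput);
  const relative = path.relative(BASE_DIR, resolved);
  const escapesBase =
    relative === '' ||                       // the directory itself, not a file in it
    relative === '..' ||
    relative.startsWith('..' + path.sep) ||  // climbs above BASE_DIR
    path.isAbsolute(relative);               // e.g. a different drive on Windows
  return escapesBase ? null : resolved;
}

console.log(resolveInsideBase('report.txt'));        // a path inside data/
console.log(resolveInsideBase('../../etc/passwd'));  // null
```

Because path.resolve works on strings only, a symbolic link inside data/ could still point elsewhere. Where that matters, the resolved path can additionally be passed through fs.realpathSync before the comparison.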

Node.js

Node.js is a popular JavaScript runtime environment that enables developers to create scalable web applications. However, input validation problems are just as common in Node.js-based applications, resulting in various types of attacks, such as XSS and command injection.

Here are some tactics you can implement to prevent input validation vulnerabilities in Node.js:

Use a validation library: There are several popular Node.js libraries available, such as Joi and Express-validator, that can be used to validate user input. These libraries provide a range of validation options, including data type validation, length validation, and regular expression validation.

Sanitize user input: Sanitizing user input involves removing any potentially dangerous characters or scripts that could be used to exploit vulnerabilities in your application. Libraries like DOMPurify can be used to sanitize user input and prevent cross-site scripting (XSS) attacks.

Avoid using the eval() function: eval() executes whatever string it is given as JavaScript, which makes it a common target for code injection attacks. Avoid it altogether and parse data with purpose-built functions such as JSON.parse() instead. The Function() constructor is not a safe alternative, because it also compiles and runs arbitrary strings as code.

Use parameterized queries: When interacting with a database in your Node.js application, it is important to use parameterized queries to prevent SQL injection attacks. Parameterized queries keep user input separate from the SQL code, so the database treats the input strictly as data and never as part of the statement.

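Use security-focused HTTP headers: Setting security-focused HTTP headers like Content-Security-Policy, X-Content-Type-Options, and X-Frame-Options can help mitigate various types of attacks, including XSS and clickjacking. The older X-XSS-Protection header is deprecated and ignored by current browsers, so it should not be relied on.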

By implementing these tactics, you can prevent improper input validation vulnerabilities and ensure the security of your Node.js-based application.
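
As an illustration of the advice on eval(), here is a small sketch that treats untrusted text strictly as data. The parseSettings function and the "limit" field with its 1 to 100 range are hypothetical and exist only for this example.

```javascript
// JSON.parse never executes code: it either returns a value or throws a SyntaxError.
function parseSettings(text) {
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return { ok: false, reason: 'not valid JSON' };
  }
  // Assumed rule for this sketch: settings must be a plain object...
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    return { ok: false, reason: 'expected a JSON object' };
  }
  // ...with an integer "limit" between 1 and 100.
  if (!Number.isInteger(parsed.limit) || parsed.limit < 1 || parsed.limit > 100) {
    return { ok: false, reason: 'limit must be an integer from 1 to 100' };
  }
  // Copy only the fields we expect instead of passing the parsed object along.
  return { ok: true, value: { limit: parsed.limit } };
}

console.log(parseSettings('{"limit": 10}'));
console.log(parseSettings('process.exit()')); // rejected as invalid JSON, never executed
```

Had the second input been passed to eval(), it would have terminated the process. With JSON.parse() it is simply rejected.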

Here’s a sample code that doesn’t include any of these measures:

const express = require('express');
const app = express();

app.get('/search', (req, res) => {
  const query = req.query.q;
  const sql = `SELECT * FROM products WHERE name = '${query}'`;
  db.query(sql, (err, result) => {
    if (err) throw err;
    res.send(result);
  });
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});

In this code, we are using user input (in the form of a query parameter) directly in a SQL query without any validation or sanitization. This can lead to a SQL injection attack, where an attacker can manipulate the input to execute arbitrary SQL code and potentially gain access to sensitive information.

Here are a couple of examples of malicious input that could cause serious harm:

'; DROP TABLE products; --

' UNION SELECT * FROM users WHERE 1 = 1; --
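
To see why such input is dangerous, it helps to look at the query text it produces. The sketch below builds the string the same way the route above does, without touching a database. The buildUnsafeQuery helper is a hypothetical name used only for this demonstration.

```javascript
// Mirrors the string interpolation from the vulnerable /search route.
function buildUnsafeQuery(q) {
  return `SELECT * FROM products WHERE name = '${q}'`;
}

console.log(buildUnsafeQuery('laptop'));
// SELECT * FROM products WHERE name = 'laptop'

console.log(buildUnsafeQuery("' UNION SELECT * FROM users WHERE 1 = 1; --"));
// SELECT * FROM products WHERE name = '' UNION SELECT * FROM users WHERE 1 = 1; --'
```

In the second case the leading quote closes the string literal early, so everything after it is read as SQL rather than as a product name, and the trailing -- is meant to comment out the leftover quote.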

To prevent these vulnerabilities, we can implement the tactics mentioned earlier. Here's an updated version of the code with these measures in place:

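const express = require('express');
const Joi = require('joi');
const sqlstring = require('sqlstring');

const app = express();
// Parse JSON request bodies so req.body is populated for POST /products
app.use(express.json());

// `db` is assumed to be a database connection created elsewhere,
// for example with the mysql package.

const searchSchema = Joi.string().required();
const productSchema = Joi.object({
  name: Joi.string().required(),
  price: Joi.number().min(0).required()
});

app.get('/search', (req, res) => {
  // Reject arrays and objects (?q[]=a, ?q[x]=y) before they reach the query
  const { error, value } = searchSchema.validate(req.query.q);
  if (error) {
    return res.status(400).send(error.details[0].message);
  }
  const sql = `SELECT * FROM products WHERE name = ${sqlstring.escape(value)}`;
  db.query(sql, (err, result) => {
    // Do not throw inside the callback; that would crash the server
    if (err) return res.status(500).send('Database error');
    res.send(result);
  });
});

app.post('/products', (req, res) => {
  const { error, value } = productSchema.validate(req.body);
  if (error) {
    return res.status(400).send(error.details[0].message);
  }
  // Validation alone does not stop SQL injection, so the values are escaped via placeholders
  const sql = sqlstring.format(
    'INSERT INTO products (name, price) VALUES (?, ?)',
    [value.name, value.price]
  );
  db.query(sql, (err, result) => {
    if (err) return res.status(500).send('Database error');
    res.send(result);
  });
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});

In this updated code, we use the Joi library to validate the user input on both routes: the GET /search parameter must be a string, and the POST /products body must match the product schema. We also use the sqlstring library to escape the user input before it is placed in either SQL statement, which prevents SQL injection attacks. Validation on its own would not be enough here, because a perfectly valid product name can still contain a quote character.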

Conclusion

Though we did cover some basic cases of improper input validation vulnerabilities, there are many more scenarios in which this flaw can occur. You must be able to see the architecture of the software you’re developing from a bigger perspective to be able to define a rule against every possible scenario.

The advice that we’ve given above is a good starting point to take on a new, more security-oriented approach in programming, but security is something that comes with knowledge and years of experience, and you shouldn't rely solely on the measures recommended above.

At this point, we’d like to take the opportunity to invite you to subscribe to our blog, because we’ve recently started scanning the internet for these vulnerabilities, which we have yet to cover.


DataGridSurface