The expansion of computer science has exposed us to an increasing number of security threats and risks. In this paper, we discuss taint analysis, a technique that can detect several common attacks on web applications and that has drawn much attention from the research community and industry. Taint analysis is a form of information-flow analysis that determines whether values from untrusted users, methods, and parameters may flow into security-sensitive operations (Omer Tripp, Marco Pistoia 2009). It tracks the variables that can be altered by user input. All user input may be risky if it is not correctly validated. This paper focuses on different taint analysis techniques that are useful for detecting malicious attacks on software.
Omer Tripp and Marco Pistoia designed and implemented Taint Analysis for Java (TAJ), a static taint analysis that meets the demands of industry-level applications. TAJ can analyse applications of practically any size, because it uses a set of techniques designed to produce useful answers within a limited time budget. Information-flow violations comprise the most serious security vulnerabilities in present-day web applications. In fact, according to the Open Web Application Security Project (OWASP), they constitute the top six security issues. Automatically detecting such vulnerabilities in real-world web applications can be difficult because of their size and complexity. Manual code review is frequently ineffective for such complex projects, and security testing may remain inconclusive because of insufficient coverage. Tripp and Pistoia propose a static analysis solution that addresses four of these top six security vulnerabilities.
Injection flaws are another vulnerability. They arise when a web application accepts user input and sends it to an interpreter as part of a query or command without validating it. Through this flaw, an attacker may trick the interpreter into executing unintended commands or changing data. A well-known attack of this kind is "structured query language injection" (SQLi).
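The injection flaw described above can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module and an in-memory database; the table layout and the function names `make_db`, `lookup_unsafe`, and `lookup_safe` are hypothetical and chosen only for illustration.

```python
import sqlite3

def make_db():
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn

def lookup_unsafe(conn, name):
    # Vulnerable: the user input becomes part of the query text,
    # so the interpreter cannot tell data from SQL syntax.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def lookup_safe(conn, name):
    # Parameterized query: the input is passed as data, never as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]
```

With the input `nobody' OR '1'='1`, the unsafe version evaluates a condition that is true for every row and returns every secret, while the parameterized version treats the whole string as a name and returns nothing.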
Malicious file execution is another threat. It arises when a web application improperly trusts input data or uses an unverified file, supplied through the application, in its file-handling functions. As a result, hostile content may be allowed to execute as a program on the server.
To locate these threats and vulnerabilities, the research community has focused increasing attention on static analysis for checking the security of information flow in web applications.
Taint analysis does not itself decide where data comes from or whether it is trustworthy or malicious. It takes all data arriving from trusted or untrusted sites, marks that data as tainted, and traces it to determine whether it reaches a sink, such as an API that takes string data and converts it into executable code (Alashjaee et al. 2019). Symbolic execution, a related technique, faces several difficulties. The number of feasible paths in a program grows exponentially with program size, which leads to path explosion. Symbolic execution is also degraded by interactions with the environment: programs interact with their environment by making system calls, receiving signals, and so on. Consistency problems can arise when execution reaches components that are not under the control of the symbolic execution tool (Boxler and Walcott, 2018). Concolic execution is a hybrid technique that combines symbolic execution with execution along a concrete path. Starting from a specific concrete input, concolic execution runs the program symbolically, gathering constraints on the input from the conditional statements encountered along that path. The proposed framework consists of three kinds of modules: dynamic analysis, black-box analysis, and static analysis. There is also an integrated framework for static taint analysis, Parfait. Several execution tools are used in taint analysis, such as KLEE, virtual machine, Mayhem, S2E, and AEG. KLEE is used to analyse programs and to generate, automatically, sets of inputs that achieve high levels of code coverage (Clause et al. 2017). Mayhem is used to identify exploitable bugs in binary programs automatically, in a scalable and efficient way.
S2E is a platform, built on symbolic execution, for analysing the properties and behaviour of system software. AEG is an end-to-end tool for automatically identifying exploitable vulnerabilities. It relies on a novel preconditioned symbolic execution technique and on path prioritization algorithms to find exploitable bugs in applications.
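The hybrid concrete and symbolic (concolic) execution described above can be sketched in a few lines. This is a toy illustration under strong assumptions, not how KLEE, Mayhem, S2E, or AEG work internally: the function `program` is a hypothetical program under test, branch conditions are recorded as Python callables, and a brute-force search over a small integer range stands in for the constraint solver that a real tool would use.

```python
def program(x, branch):
    # Hypothetical program under test; every condition goes through
    # `branch` so that the path constraint can be recorded.
    if branch(lambda v: v > 10):
        if branch(lambda v: v % 2 == 0):
            return "bug"
        return "big-odd"
    return "small"

def run(x):
    # Concrete run that also gathers (condition, outcome) pairs.
    trace = []
    def branch(pred):
        taken = pred(x)
        trace.append((pred, taken))
        return taken
    return program(x, branch), trace

def solve(constraints, domain=range(-50, 51)):
    # Stand-in for a constraint solver: brute-force search (assumption).
    for candidate in domain:
        if all(pred(candidate) == want for pred, want in constraints):
            return candidate
    return None

def explore(seed):
    # Start from a concrete seed, then negate each recorded branch
    # in turn to derive inputs that drive execution down new paths.
    worklist, results = [seed], {}
    while worklist:
        x = worklist.pop()
        result, trace = run(x)
        path = tuple(taken for _, taken in trace)
        if path in results:
            continue
        results[path] = (x, result)
        for i, (pred, taken) in enumerate(trace):
            new_input = solve(trace[:i] + [(pred, not taken)])
            if new_input is not None:
                worklist.append(new_input)
    return results
```

Calling `explore(0)` starts on the "small" path and then derives inputs that reach the other two paths, including the one labelled "bug". Each additional independent branch can double the number of paths to explore, which is the path explosion noted above.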
Web applications are a vital communication channel between many kinds of clients and service providers over the internet. They are exposed to several vulnerabilities, such as security flaws, cross-site scripting, SQL injection, and many more (Dai et al. 2018). To address this, taint analysis offers two approaches to protecting web applications: static analysis and dynamic analysis.
Static taint analysis computes an over-approximation of the set of instructions that are influenced by user input. This set of tainted instructions is computed statically by analysing the program source. The main advantage of static analysis is that it accounts for all possible execution paths of a program (Ferrara et al. 2019). There are several effective static analysis models for web applications. The TAJ model handles reflective calls, taint flow through containers, detection of taint in the internal state of objects, Java Server Pages, Enterprise JavaBeans, the Spring and Struts frameworks, and other essential features that are largely ignored in the literature but matter greatly for web applications. A basic static analysis assumes that the program and library code are available for direct analysis (Galea and Kroening, 2020). To improve precision and performance, the analysis is tuned with several high-level models. One static analysis framework is Parfait, a multi-layered static program analysis framework. It is used in pre-processing stages and helps to reduce graph reachability problems. The general-purpose models that make static analysis effective include code-reduction models and models that approximate the behaviour of web frameworks, native methods, and reflection APIs. The code-reduction model optimizes the analysis by excluding benign library packages, classes, and subclasses on the basis of a generated whitelist (Luo et al. 2019). Web frameworks need precise analysis that draws on the information in their configuration files. These frameworks typically implement a model-view-controller pattern, in which the controllers are configured with an "eXtensible Markup Language" (XML) file.
For native methods and reflection APIs, the taint analysis includes dedicated machinery for modelling the behaviour of Java reflection APIs such as Method.invoke and the reflective methods of Class. Where the target of a reflective call can be inferred from its argument values, the analysis synthesizes the corresponding abstraction in place of the reflective call (Paduraru et al. 2019). The analysis also relies on hand-coded synthetic models for several native methods in the Java libraries. This modelling is vital because native methods not only affect how data and information flow are tracked but also figure prominently in several security-related operations.
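The over-approximating nature of static taint analysis can be sketched with Python's standard `ast` module. This is a deliberately small, flow-insensitive illustration written for Python source, not a reconstruction of TAJ or Parfait, which target other languages and use far more precise models. The source set (`input`) and the sink set (`eval`, `exec`, `system`) are assumptions chosen for the example.

```python
import ast

SOURCES = {"input"}                  # assumed: calls that return untrusted data
SINKS = {"eval", "exec", "system"}   # assumed: security-sensitive calls

def call_name(node):
    # Name of the called function, e.g. "eval" or the "system" in os.system.
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None

def is_tainted(expr, tainted):
    # An expression is tainted if it mentions a tainted name or a source call.
    for node in ast.walk(expr):
        if isinstance(node, ast.Name) and node.id in tainted:
            return True
        if isinstance(node, ast.Call) and call_name(node) in SOURCES:
            return True
    return False

def find_taint_flows(source):
    tree = ast.parse(source)
    tainted = set()
    changed = True
    while changed:  # propagate taint through assignments to a fixed point
        changed = False
        for node in ast.walk(tree):
            if isinstance(node, ast.Assign) and is_tainted(node.value, tainted):
                for target in node.targets:
                    for name in ast.walk(target):
                        if isinstance(name, ast.Name) and name.id not in tainted:
                            tainted.add(name.id)
                            changed = True
    findings = []
    for node in ast.walk(tree):  # report sink calls with a tainted argument
        if isinstance(node, ast.Call) and call_name(node) in SINKS:
            if any(is_tainted(arg, tainted) for arg in node.args):
                findings.append((node.lineno, call_name(node)))
    return sorted(findings)
```

Because the sketch ignores statement order, a variable that is overwritten with a constant before reaching a sink is still reported. That is a false positive, and it is the price of covering every possible execution path without running the program.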
Dynamic analysis marks the original data that comes from an untrusted source in a web application. It tracks all tainted data stored in memory, because this data may be used in dangerous ways. It can also detect potential bugs. This approach can detect input validation vulnerabilities with a low rate of false positives. Dynamic analysis requires data to be collected from a running server. It therefore gives an accurate picture of the specific web application under analysis (Pauck et al. 2018). Dynamic analysis also yields more true-positive results, and for this reason it is better than black-box testing. Vulnerability detection for web applications can also combine dynamic and static analysis. In addition, vulnerabilities can be detected with a penetration testing module whose results are then used as input to the dynamic analysis model. The combination of dynamic and static analysis is used to prevent cross-site scripting (Saad et al. 2018); one framework built on this combination is SDCF. Several dynamic analysis tools are used on web applications for vulnerability detection, data-leak detection, forensics, and malware detection, such as TaintEraser, Information flow, LIFT, and CloudFence. TaintEraser is used to prevent leaks of sensitive data. This tool implements taint propagation in the kernel to reduce the overhead of tracking the binary.
This tool also enables fast loading of applications and uses semantic-aware instruction-level tainting to increase accuracy. An information flow tool prevents data leaks by using leak detection techniques. LIFT is another dynamic analysis tool, which detects vulnerabilities with the help of information flow tracking. It can detect and target particular vulnerability exploits such as buffer overflows, worms, and format string attacks. It also exploits dynamic binary instrumentation and optimizations to detect several kinds of security attack. CloudFence is a data-flow tracking service model that can monitor data leaks across cloud services. CloudFence supports byte-level data tagging and uses the Pin dynamic binary translator (Staicu et al. 2020). Dynamic analysis also has a few disadvantages. Program execution is much slower than under other analyses because of the additional checks that are performed, and the analysis can detect a problem only when the corresponding execution path has actually been executed. Paths that are never executed go unchecked, which leads to false negatives.
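The mark, track, and check cycle of dynamic taint analysis can be sketched at the level of Python strings. The tools named above work on binaries, at instruction or byte granularity; this sketch only illustrates the idea. The names `Tainted`, `source`, `sanitize`, and `run_command` are hypothetical, and only string concatenation propagates the taint mark here.

```python
class TaintError(Exception):
    """Raised when tainted data reaches a sink."""

class Tainted(str):
    """A string that carries a taint mark (illustrative only)."""

    def __add__(self, other):
        # tainted + anything stays tainted
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # anything + tainted stays tainted
        return Tainted(str.__add__(other, self))

def source(value):
    # Mark data arriving from an untrusted origin.
    return Tainted(value)

def sanitize(value):
    # Hypothetical validator: accept only alphanumeric input and
    # return a plain str, which drops the taint mark.
    if not value.isalnum():
        raise ValueError("rejected input")
    return str(value)

def run_command(cmd):
    # Hypothetical sink: refuses tainted data instead of executing it.
    if isinstance(cmd, Tainted):
        raise TaintError("tainted data reached a sink")
    return "executed: " + cmd
```

A command built as `"cat " + source("report")` is rejected by `run_command`, while `"cat " + sanitize(source("report"))` is accepted. Operations that this sketch does not intercept, such as slicing or formatting, silently lose the mark, and only the paths that are actually run are checked, which mirrors the false negatives discussed above.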
A software vulnerability is a flaw in system software that can cause a computer system or program to crash, to produce invalid or irrelevant output, or to behave in an unintended way. Vulnerability detection is the process of confirming whether a system contains flaws that an attacker could leverage to compromise the security of the whole system or of the platform on which it runs (Von Maltitz et al. 2017). Software security approaches include vulnerability detection, which focuses on identifying and correcting flaws, and intrusion prevention and detection, which instead blocks and detects the attacks that exploit those flaws. Vulnerability detection for software operates at two levels: the source code level and the binary code level.
Software vulnerabilities are software bugs that have security implications; they can be regarded as a subset of software defects. Vulnerabilities can be detected in source code. At the source code level, tools use code similarity detection based on tokens, control flow graphs, abstract syntax trees, and dependency graphs. These source code analysis tools are referred to as "static application security testing" tools. Some source code tools are moving into the IDE, because that is where source code is written. These tools can be run on many kinds of software and can be run again whenever the software changes (Wang et al. 2018). They identify source code vulnerabilities automatically and with good accuracy, including SQL injection flaws, buffer overflows, and so on. They give developers useful output by highlighting the affected source files, subsections, and line numbers. However, source code tools produce a large number of false positives, and they cannot find configuration issues. They also have difficulty analysing source code that cannot be compiled, for example when the required libraries are not available.
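The token-based similarity detection mentioned above can be sketched with Python's standard `tokenize` module. This is one simple way to realize the idea, assumed here for illustration: identifiers and literals are replaced by placeholders, and two fragments are compared by the Jaccard similarity of their token trigrams. The function names and the choice of trigrams are assumptions, not taken from any specific tool.

```python
import io
import keyword
import tokenize

# Layout and comment tokens carry no structure we want to compare.
SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}

def normalized_tokens(source):
    # Replace identifiers and literals with placeholders so that
    # renamed variables or changed constants do not hide a match.
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            tokens.append("ID")
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")
        elif tok.type == tokenize.STRING:
            tokens.append("STR")
        else:
            tokens.append(tok.string)
    return tokens

def ngrams(tokens, n=3):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def similarity(a, b, n=3):
    # Jaccard similarity between the token n-gram sets of two fragments.
    grams_a = ngrams(normalized_tokens(a), n)
    grams_b = ngrams(normalized_tokens(b), n)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

A fragment that concatenates user input into a query string and then executes it scores 1.0 against a copy with different variable names, so a known vulnerable pattern can be matched against new code. Structurally unrelated code scores close to zero.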