• Tell story: The typical result of a forensic investigation is a final report and, perhaps, an oral presentation in court. The actual presentation may only contain the part of the story that is strongly supported by the digital evidence; weaker points may be established by drawing on evidence from other sources.
Top-down Processes
Top-down processes are analytical—they provide context and direction for the less structured search and organization of the evidence. Partial, or tentative, conclusions are used to drive the search for supporting and contradictory pieces of evidence.
• Re-evaluate: Feedback from clients may necessitate re-evaluations, such as the collection of stronger evidence, or the pursuit of alternative theories.
• Search for support: A hypothesis may need more facts to be of interest and, ideally, would be tested against all (reasonably) possible alternative explanations.
• Search for evidence: Analysis of theories may require the re-evaluation of evidence to ascertain its significance/provenance, or may trigger the search for more/better evidence.
• Search for relations: Pieces of evidence in the evidence file can suggest new searches for facts and relations in the data.
• Search for information: The feedback loop from any of the higher levels can ultimately cascade into a search for additional information; this may include new sources, or the reexamination of information that was filtered out during previous passes.
Foraging Loop
It has been observed [138] that analysts tend to start with a high-recall/low-selectivity query, which encompasses a fairly large set of documents—many more than the analyst can afford to read. The original set is then successively modified and narrowed down before the documents are read and analyzed.
The foraging loop is a balancing act between three kinds of processing that an analyst can perform—explore, enrich, and exploit. Exploration effectively expands the shoebox by including larger amounts of data; enrichment shrinks it by providing more specific queries that include fewer objects for consideration; exploitation is the careful reading and analysis of an artifact to extract facts and inferences. Each of these options has varying cost and potential rewards and, according to information foraging theory [141], analysts seek to optimize their cost/benefit trade-off.
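To make the trade-off concrete, the Python sketch below scores three hypothetical next steps—one explore, one enrich, and one exploit action—by expected gain per unit of analyst time and picks the most promising one. The action names, costs, and yields are invented for illustration and are not part of the formal foraging model.

# A minimal sketch of the explore/enrich/exploit trade-off described above.
# All names, costs, and gains are hypothetical illustrations.

from dataclasses import dataclass

@dataclass
class Action:
    name: str             # "explore", "enrich", or "exploit" step
    cost: float           # estimated analyst time (e.g., minutes)
    expected_gain: float  # estimated number of new relevant facts

def next_action(actions):
    """Pick the action with the best expected gain per unit of cost."""
    return max(actions, key=lambda a: a.expected_gain / a.cost)

if __name__ == "__main__":
    candidates = [
        Action("explore: broaden keyword query", cost=30, expected_gain=12),
        Action("enrich: add date-range filter", cost=10, expected_gain=6),
        Action("exploit: read top-ranked document", cost=20, expected_gain=5),
    ]
    best = next_action(candidates)
    print(f"Next step: {best.name} "
          f"(gain/cost = {best.expected_gain / best.cost:.2f})")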
Sense-making Loop
Sense-making is a cognitive term and, according to Klein’s [102] widely quoted definition, is the ability to make sense of an ambiguous situation. It is the process of creating situational awareness and understanding to support decision making under uncertainty; it involves the understanding of connections among people, places, and events in order to anticipate their trajectories and act effectively.
There are three main processes involved in the sense-making loop: problem structuring—the creation and exploration of hypotheses; evidentiary reasoning—the employment of evidence to support or disprove hypotheses; and decision making—selecting a course of action from a set of available alternatives.
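The following toy Python example illustrates how these three activities fit together: two invented hypotheses are structured as collections of signed evidence weights, the weights are summed as a crude stand-in for evidentiary reasoning, and the decision step simply ranks the hypotheses by support. All hypotheses, evidence items, and weights are hypothetical.

# A toy illustration (not a formal model) of the three sense-making activities.
# The hypotheses and weights below are invented for illustration only.

def support_score(evidence):
    """Sum signed evidence weights: positive supports, negative contradicts."""
    return sum(evidence.values())

hypotheses = {
    "data was exfiltrated via USB drive": {
        "USB device connected at 02:14": +2,
        "no matching files on the device image": -1,
    },
    "data was exfiltrated via cloud upload": {
        "browser history shows file-sharing site": +1,
        "proxy logs show large outbound transfer": +2,
    },
}

ranked = sorted(hypotheses.items(),
                key=lambda kv: support_score(kv[1]), reverse=True)
for name, evidence in ranked:
    print(f"{support_score(evidence):+d}  {name}")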
Data Extraction vs. Analysis vs. Legal Interpretation
Considering the overall process from Figure 3.4, we gain a better understanding of the relationships among the different actors. At present, forensics researchers and tool developers primarily provide the means to extract data from the forensic targets (step 1), and the basic means to search and filter it. Although some data analytics and natural language processing methods (like entity extraction) are starting to appear in dedicated forensic software, these capabilities are still fairly rudimentary in terms of their ability to automate parts of the sense-making loop.
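As a rough illustration of the kind of entity extraction mentioned above, the Python sketch below pulls email addresses and IPv4 addresses out of text that has already been recovered from a target. The regular expressions are deliberately simplified and would need considerable hardening for real case work.

# A minimal sketch of entity extraction over recovered text.
# The patterns are simplified illustrations, not production-grade extractors.

import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
IPV4_RE  = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_entities(text):
    """Return simple entity lists pulled from recovered text."""
    return {
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "ipv4":   sorted(set(IPV4_RE.findall(text))),
    }

if __name__ == "__main__":
    sample = "Contact alice@example.org; last login from 192.168.1.77."
    print(extract_entities(sample))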
The role of the legal experts is to support the upper right corner of the process in terms of building/disproving legal theories. Thus, the investigator’s task can be described as the translation of highly specific technical facts into a higher-level representation and theory that explains them. The explanation is almost always tied to the sequence of actions of humans involved in the case.
In sum, investigators need not be software engineers but must have enough proficiency to understand the significance of the artifacts extracted from the data sources, and be able to competently read the relevant technical literature (peer-reviewed articles). Similarly, analysts must have a working understanding of the legal landscape and must be able to produce a competent report, and properly present their findings on the witness stand, if necessary.
1A more detailed definition and discussion of traditional forensics is beyond our scope.
CHAPTER 4
System Analysis
Modern computer systems in general use still follow the original von Neumann architecture [192], which models a computer system as consisting of three main functional units—CPU, main memory, and secondary storage—connected via data buses. This chapter explores the means by which forensic analysis is applied to these subsystems. To be precise, the actual investigative targets are the respective operating system modules controlling the different hardware subsystems.
System analysis is one of the cornerstones of digital forensics. The average user has very little understanding of what kind of information operating systems maintain about their activities, and frequently does not have the knowledge and/or privilege level to tamper with system records. In effect, this creates a “Locard world” in which user actions leave a variety of traces that allow them to be tracked.
System analysis offers considerable leverage and high pay-offs: once an extraction or analytical method is developed, it can be applied directly to artifacts created by many different applications.
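This leverage can be illustrated with a common case: many applications—browsers, messengers, mobile apps—store their records in SQLite databases, so a single extraction routine works across all of them. The Python sketch below, using the standard sqlite3 module, enumerates the tables of any such artifact; the file name is only a placeholder for a database recovered from an image.

# A hedged illustration of extraction leverage: one routine enumerates the
# tables of any SQLite artifact, regardless of which application created it.
# The path "History.db" is a placeholder, not a prescribed artifact name.

import sqlite3

def list_tables(db_path):
    """Return the names of all tables in a SQLite artifact (opened read-only)."""
    with sqlite3.connect(f"file:{db_path}?mode=ro", uri=True) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    return [name for (name,) in rows]

if __name__ == "__main__":
    # e.g., a browser history database recovered from a disk image
    print(list_tables("History.db"))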
4.1 STORAGE FORENSICS
Persistent storage in the form of hard disk drives (HDDs), solid state drives (SSDs), optical disks, external (USB-connected) media, etc., is the primary source of evidence for most digital forensic investigations. Although the importance of memory forensics in solving cases has grown tremendously, a thorough examination of persistent data has remained a critical component of almost all forensic investigations since the very beginning.
4.1.1 DATA ABSTRACTION LAYERS
Computer systems organize raw storage in successive layers of abstraction—each software layer (some may be in firmware) builds incrementally more abstract data representations dependent only on the interface provided by the layer immediately below it. Accordingly, forensic storage analysis can be performed at several levels of abstraction:
Physical media. At the lowest level, every storage device encodes a sequence of bits and it is, in principle, possible to use a custom mechanism to extract the data bit by bit. In practice, this is rarely done, as it is an expensive and time-consuming process. One example of this process is second-generation mobile phones, for which it is feasible to physically remove (desolder) the memory chips and acquire their content [194]. Thus, the lowest level at which most practical examinations are performed is the host bus adapter (HBA) interface. Adapters implement a standard protocol (SATA, SCSI, etc.) through which they can be made to perform low-level operations. For damaged hard drives, it is often possible to perform