Digital Forensic Science. Vassil Roussev

Digital Forensic Science - Vassil Roussev Synthesis Lectures on Information Security, Privacy, and Trust

to the field—seeks to offer a more detailed and formal description of the process. The model employs finite state machines to capture the state of the system, as well as its (algorithmic) reaction to outside events. One of the main considerations is the development of an investigative model that avoids human bias by focusing on modeling the computation itself along with strict scientific hypothesis testing. The investigation is defined as a series of yes/no questions (predicates) that are evaluated with respect to the available history of the computation.

       Primitive Computer History Model

      This model assumes that the computer being investigated can be represented as a finite state machine (FSM), which transitions from one state to another in reaction to events. Formally, the FSM is a quintuple (Q, Σ, δ, s0, F), where Q is a finite set of states, Σ is a finite alphabet of event symbols, δ is the transition function δ : Q × Σ → Q, s0 ∊ Q is the starting state of the machine, and F ⊆ Q is the set of final states.
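The quintuple can be made concrete with a small sketch. The toy machine below (its states, event names, and transitions) is purely hypothetical and is not taken from the model's authors; for brevity, the transition table is partial, whereas the formal definition requires δ to be defined for every pair in Q × Σ.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple

State = str
Event = str


@dataclass
class FSM:
    states: FrozenSet[State]                  # Q
    alphabet: FrozenSet[Event]                # Sigma
    delta: Dict[Tuple[State, Event], State]   # delta : Q x Sigma -> Q
    start: State                              # s0
    finals: FrozenSet[State]                  # F, a subset of Q

    def run(self, events: Iterable[Event]) -> List[State]:
        """Return every state visited, beginning with the start state."""
        current = self.start
        visited = [current]
        for event in events:
            # An undefined transition raises KeyError in this sketch.
            current = self.delta[(current, event)]
            visited.append(current)
        return visited


# Hypothetical three-state machine used only for illustration.
toy = FSM(
    states=frozenset({"off", "idle", "busy"}),
    alphabet=frozenset({"power_on", "start_job", "finish_job"}),
    delta={
        ("off", "power_on"): "idle",
        ("idle", "start_job"): "busy",
        ("busy", "finish_job"): "idle",
    },
    start="off",
    finals=frozenset({"idle"}),
)

trace = toy.run(["power_on", "start_job", "finish_job"])
```

Here `trace` records the full sequence of visited states, which is the kind of history the rest of the model reasons about.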

      The primitive history of a system describes the lowest-level state transitions (such as the execution of individual instructions), and consists of the sequence of primitive states and events that occurred.

      The primitive state of a system is defined by the discrete values of its primitive, uniquely addressable storage locations. These may include anything from a CPU register to the content of network traffic (which is treated as temporary storage). As an illustration, Figure 3.1 shows an event E1 reading the values from storage locations R3 and R6 and writing to locations R3 and R4.

      Figure 3.1: Primitive computer history model example: event E1 is reading the values from storage locations R3 and R6 and writing to locations R3 and R4 [27].

      The primitive history duration is the set T containing the times for which the system has a history. The interval between consecutive times in T, Δt, must be shorter than the fastest state change in the system. The primitive state history is a function hps : T → Q that maps a time t ∊ T to the primitive state that existed at that time. The primitive event history is a function hpe : T → Σ that maps a time t ∊ T to a primitive event in the period (t – Δt, t + Δt).
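As a minimal sketch, the two history functions can be written as lookup tables and queried with yes/no predicates. Everything concrete here is an assumption made for illustration: times are integer ticks standing for one Δt each, the state is reduced to the three locations R3, R4, and R6, the stored values are invented, and E1 is given the hypothetical semantics "R3 becomes R3 + R6, R4 becomes R6".

```python
from typing import Callable, Dict, Optional, Tuple

# Assumption: times are integer ticks, each tick standing for one delta-t.
T = range(5)

# Primitive state reduced to the values of (R3, R4, R6); values are invented.
State = Tuple[int, int, int]

# h_ps : T -> Q, the primitive state history.
h_ps: Dict[int, State] = {
    0: (1, 0, 5),
    1: (1, 0, 5),
    2: (6, 5, 5),  # effect of E1 (hypothetical: R3 <- R3 + R6, R4 <- R6)
    3: (6, 5, 5),
    4: (6, 5, 5),
}

# h_pe : T -> Sigma, the primitive event history; "nop" marks idle ticks.
h_pe: Dict[int, str] = {0: "nop", 1: "E1", 2: "nop", 3: "nop", 4: "nop"}


def ever(predicate: Callable[[State], bool]) -> bool:
    """Yes/no question: did any recorded state satisfy the predicate?"""
    return any(predicate(h_ps[t]) for t in T)


def first_time(predicate: Callable[[State], bool]) -> Optional[int]:
    """Earliest tick whose state satisfies the predicate, or None."""
    return next((t for t in T if predicate(h_ps[t])), None)


# Did R4 ever hold the value 5, and if so, when was that first observed?
r4_held_5 = ever(lambda s: s[1] == 5)
when = first_time(lambda s: s[1] == 5)
```

The event recorded one tick before `when` can then be looked up in `h_pe` to see what may have caused the change.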

      The model described so far is capable of describing a static computer system; in practice, this is insufficient because a modern computing system is dynamic: it can add resources (such as storage) and capabilities (code) on the fly. Therefore, the computer history model uses a dynamic FSM model with sets and functions to represent the changing system capabilities. Formally, each of the Q, Σ, and δ sets and functions can change for each t ∊ T.

       Complex Computer History Model

      The primitive model presented is rarely practical on contemporary computer systems executing billions of instructions per second (code reverse engineering would be an exceptional case). Also, there is a mismatch between the level of abstraction of the representation and that of the questions that an investigator would want to ask (e.g., was this file downloaded?). Therefore, the model provides the means to aggregate the state of the system and ask questions at the appropriate level of abstraction.

      Complex events are state transitions that cause one or more lower-level complex or primitive events to occur; for example, copying a file triggers a large number of primitive events. Complex storage locations are virtual storage locations created by software; these are the ephemeral and persistent data structures used by software during normal execution. For example, a file is a complex storage location whose name-value attribute pairs include the file name, several different timestamps, permissions, and content.

      Figure 3.2 shows a complex event E1 reading from complex storage locations D1 and D2 and writing a value to D1. At a lower level, E1 is performed using events E1a and E1b, such as CPU, or I/O instructions. The contents of D1 and D2 are stored in locations (D1a, D1b) and (D2a, D2b), respectively.
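The layering in this example can be sketched as follows. The location and event names (D1, D2, D1a, D1b, D2a, D2b, E1, E1a, E1b) come from the text; the stored values and what each low-level event computes are assumptions chosen only to make the example runnable.

```python
from typing import Dict, Tuple

# Primitive storage: address -> value (values are invented).
primitive: Dict[str, int] = {"D1a": 1, "D1b": 2, "D2a": 3, "D2b": 4}

# Each complex location is a view over a group of primitive locations.
complex_locations: Dict[str, Tuple[str, ...]] = {
    "D1": ("D1a", "D1b"),
    "D2": ("D2a", "D2b"),
}


def read_complex(store: Dict[str, int], name: str) -> Tuple[int, ...]:
    """Assemble the value of a complex location from its primitive parts."""
    return tuple(store[addr] for addr in complex_locations[name])


def e1a(store: Dict[str, int]) -> Dict[str, int]:
    """Hypothetical low-level event: D1a <- D1a + D2a."""
    new = dict(store)
    new["D1a"] = store["D1a"] + store["D2a"]
    return new


def e1b(store: Dict[str, int]) -> Dict[str, int]:
    """Hypothetical low-level event: D1b <- D1b + D2b."""
    new = dict(store)
    new["D1b"] = store["D1b"] + store["D2b"]
    return new


def e1(store: Dict[str, int]) -> Dict[str, int]:
    """Complex event E1: reads D1 and D2, writes D1, carried out by E1a and E1b."""
    return e1b(e1a(store))


after = e1(primitive)
d1_after = read_complex(after, "D1")
d2_after = read_complex(after, "D2")
```

Each event returns a new store rather than mutating the old one, so the states before and after the event both remain available for inspection.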

      Figure 3.2: Complex history event examples: event E1 with two complex cause locations and one complex effect location [27].

       General Investigation Process

      The sequence of queries pursued by the investigator will depend on the specific objectives of the inquiry, as well as the experience and training of the person performing it. The CHM is agnostic with respect to the overall process followed (we will discuss the cognitive perspective in Section 3.3.3) and does not assume a specific sequence of high-level phases. It does, however, postulate that the inquiry follow the general scientific method, which typically consists of four phases: Observation, Hypothesis Formulation, Prediction, and Testing & Searching.

      Observation includes running appropriate tools to capture and observe aspects of the state of the system that are of interest, such as listing files and processes, and rendering the content of files. During Hypothesis Formulation, the investigators combine the observed data with their domain knowledge to formulate hypotheses that can be tested, and potentially falsified, in the history model. In the Prediction phase, the analyst identifies specific evidence that would be consistent with, or would contradict, the hypothesis. Based on the predictions, experiments are performed in the Testing phase, and the outcomes are used to guide further iterations of the process.

       Categories of Forensic Analysis

      Based on the outlined framework, the CHM identifies seven categories of analytical techniques.

      History duration. The sole technique in this category is operational reconstruction: it uses event reconstruction and temporal data from the storage devices to determine when events occurred and at what points in time the system was active. Primary sources for this analysis include log files, as well as the variety of timestamp attributes kept by the operating system and applications.
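One simple way to approximate periods of system activity from collected timestamps is to group them wherever the gap between neighbors stays below a threshold. This is only an illustrative heuristic, not a method prescribed by the CHM; the function name `active_periods`, the sample timestamps, and the 30-minute threshold are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def active_periods(
    timestamps: Iterable[datetime], max_gap: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Group timestamps into (first, last) periods split at gaps > max_gap."""
    ordered = sorted(timestamps)
    if not ordered:
        return []
    periods = []
    start = prev = ordered[0]
    for ts in ordered[1:]:
        if ts - prev > max_gap:
            # The gap is too long: close the current period, open a new one.
            periods.append((start, prev))
            start = ts
        prev = ts
    periods.append((start, prev))
    return periods


# Invented timestamps, as might be gathered from logs and file metadata.
observed = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 14, 10),
    datetime(2024, 1, 1, 9, 5),
    datetime(2024, 1, 1, 9, 20),
    datetime(2024, 1, 1, 14, 0),
]

periods = active_periods(observed, max_gap=timedelta(minutes=30))
```

In practice, timestamps from different sources may use different clocks and time zones, so they would need to be normalized before being merged in this way.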

      Primitive storage system configuration. The techniques in this category define the capabilities of the primitive storage system. These include the names of the storage devices, the number of addresses for each storage device, the domain of each address on each storage device, and when each storage device was connected. Together, these sets and functions define the set of possible states Q of the FSM.
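To see how the storage configuration fixes the size of Q, note that if every address can independently take any value in its domain, the number of possible states is the product of the domain sizes over all addresses. The sketch below makes the simplifying assumption that all addresses on a device share one domain size; the device names and numbers are invented.

```python
from math import prod
from typing import Dict

# Hypothetical configuration: per-device address count and domain size.
devices: Dict[str, Dict[str, int]] = {
    "cpu_regs": {"addresses": 4, "domain_size": 256},
    "ram": {"addresses": 8, "domain_size": 256},
}


def state_space_size(config: Dict[str, Dict[str, int]]) -> int:
    """Number of distinct primitive states: domain_size ** addresses per device."""
    return prod(d["domain_size"] ** d["addresses"] for d in config.values())


q_size = state_space_size(devices)
```

Even this tiny configuration of twelve byte-sized locations yields 2^96 possible states, which illustrates why the primitive model is rarely applied directly.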

      Primitive event system configuration. Methods in this category define the capabilities of the primitive event system; that is, define the names of the event devices connected, the event symbols for each event device, the state change function for each event device, and when each event device was connected. Together, these sets and functions define the set of event symbols Σ and state change function δ. Since primitive events are almost never of direct interest to an investigation, these techniques are not generally performed.

      Primitive state and event definition. Methods in this category define the primitive state history (hps) and primitive event history (hpe) functions. There are five types of techniques that can be used to formulate and test this type of hypothesis and each class has a directional
