Important Artifacts in Windows for Digital Forensics with Python

This chapter explains the various concepts involved in Microsoft Windows forensics and the important artifacts that investigators can obtain during an investigation.

Introduction

Artifacts are objects or areas within a computer system that contain important information related to the activities performed by the computer user. The type and location of this information depend on the operating system. During forensic analysis, these artifacts play a very important role in supporting or refuting an investigator’s observations.

The Importance of Windows Artifacts in Forensics

Windows artifacts are important for the following reasons

  • The vast majority of desktop and laptop computers worldwide run some version of Windows, so most of the systems an examiner encounters are Windows machines. This is why Windows artifacts are very important to digital forensic examiners.
  • The Windows operating system stores different types of evidence related to user activities on a computer system. This is another reason why Windows artifacts are important for digital forensics.

  • Many times, investigators focus their investigations on old and traditional areas, such as user-created data. Windows artifacts can lead investigations into non-traditional areas, such as data created by the system itself.

  • Windows provides a wealth of artifacts, which can be helpful to investigators, as well as companies and individuals conducting informal investigations.

  • The increase in cybercrime in recent years is another reason why Windows artifacts are important.

Windows Artifacts and Their Python Scripts

In this section, we will discuss some Windows artifacts and Python scripts to obtain information from them.

Recycle Bin

This is one of the most important Windows artifacts in forensic investigations. The Windows Recycle Bin contains files that have been deleted by the user but not yet physically removed by the system. Even if the user has completely deleted a file from the system, it can still be a valuable investigative source, because an investigator can extract useful details from the deleted file, such as its original path and the time it was sent to the Recycle Bin.

Note that the storage method for Recycle Bin evidence varies depending on the Windows version. In the following Python script, we’re working with Windows 7, which creates two files: a $R file containing the actual contents of the recycled file; and a $I file containing the original file name, path, and size at the time of deletion.
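
Before turning to the evidence image, it helps to see the $I layout in isolation. The following minimal, standalone sketch parses a $I file that has already been copied out of $Recycle.bin to a local path (the file name in the example call is hypothetical), following the Windows 7 layout described above:

import struct

def parse_dollar_i_local(path):
   # Read the whole $I record; on Windows 7 these are small, fixed-size files
   with open(path, 'rb') as f:
      data = f.read()
   # The first eight bytes are the header; Windows 7 uses the value 1
   if data[0:8] != b'\x01\x00\x00\x00\x00\x00\x00\x00':
      return None
   # Bytes 8-15 hold the original file size, bytes 16-23 the deletion FILETIME
   file_size = struct.unpack('<q', data[8:16])[0]
   deleted_time = struct.unpack('<q', data[16:24])[0]
   # Bytes 24 onwards hold the original path as null-padded UTF-16
   file_path = data[24:24 + 520].decode('utf-16').strip('\x00')
   return file_size, deleted_time, file_path

# Example call (hypothetical file name):
# print(parse_dollar_i_local('$IABCDEF.txt'))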

For the Python script, we need to install the third-party modules: pytsk3, pyewf, and unicodecsv. We can install them using pip. We can extract information from the Recycle Bin by following these steps:

  • First, we need to use a recursive method to scan the $Recycle.bin folder and select all files that begin with $I.
  • Next, we will read the contents of these files and parse the available metadata structures.

  • Now, we will search for related $R files.

  • Finally, we will write the results to a CSV file for easy reference.

Let’s see how to achieve this using Python code –

First, we need to import the following Python libraries –

from __future__ import print_function
from argparse import ArgumentParser

import datetime
import os
import struct

from utility.pytskutil import TSKUtil
import unicodecsv as csv

Next, we need to provide arguments to the command line handler. Note that it accepts three arguments – the first is the path to the evidence file, the second is the type of evidence file, and the third is the desired output path for the CSV report, as shown below.

if __name__ == '__main__':
   parser = ArgumentParser('Recycle Bin evidences')
   parser.add_argument('EVIDENCE_FILE', help = "Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help = "Evidence file format",
                       choices = ('ewf', 'raw'))
   parser.add_argument('CSV_REPORT', help = "Path to CSV report")
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.CSV_REPORT)

Now, define the main() function, which will handle the overall processing. It will search for $I files, as shown below.

def main(evidence, image_type, report_file):
   tsk_util = TSKUtil(evidence, image_type)
   dollar_i_files = tsk_util.recurse_files("$I", path = '/$Recycle.bin', logic = "startswith")

   if dollar_i_files is not None:
      processed_files = process_dollar_i(tsk_util, dollar_i_files)
      write_csv(report_file,['file_path', 'file_size', 'deleted_time','dollar_i_file', 'dollar_r_file', 'is_directory'],processed_files)
   else:
      print("No $I files found")

Now, if $I files are found, they are passed to the process_dollar_i() function, which accepts the tsk_util object and the list of $I files, as shown below.

def process_dollar_i(tsk_util, dollar_i_files):
   processed_files = []

   for dollar_i in dollar_i_files:
      file_attribs = read_dollar_i(dollar_i[2])
      if file_attribs is None:
         continue
      file_attribs['dollar_i_file'] = os.path.join('/$Recycle.bin', dollar_i[1][1:])

Now, search for the $R file as shown below –

      recycle_file_path = os.path.join('/$Recycle.bin', dollar_i[1].rsplit("/", 1)[0][1:])
      dollar_r_files = tsk_util.recurse_files(
         "$R" + dollar_i[0][2:], path = recycle_file_path, logic = "startswith")

      if dollar_r_files is None:
         dollar_r_dir = os.path.join(recycle_file_path, "$R" + dollar_i[0][2:])
         dollar_r_dirs = tsk_util.query_directory(dollar_r_dir)

         if dollar_r_dirs is None:
            file_attribs['dollar_r_file'] = "Not Found"
            file_attribs['is_directory'] = 'Unknown'
         else:
            file_attribs['dollar_r_file'] = dollar_r_dir
            file_attribs['is_directory'] = True
      else:
         dollar_r = [os.path.join(recycle_file_path, r[1][1:]) for r in dollar_r_files]
         file_attribs['dollar_r_file'] = ";".join(dollar_r)
         file_attribs['is_directory'] = False

      processed_files.append(file_attribs)
   return processed_files

Now, define the read_dollar_i() method to read the $I files, in other words, to parse the metadata. We will use the read_random() method to read the eight-byte signature; this will return None if the signature does not match. Afterwards, if the $I file is valid, we will read and interpret the values stored in it.

def read_dollar_i(file_obj):
   if file_obj.read_random(0, 8) != '\x01\x00\x00\x00\x00\x00\x00\x00':
      return None
   raw_file_size = struct.unpack('<q', file_obj.read_random(8, 8))
   raw_deleted_time = struct.unpack('<q', file_obj.read_random(16, 8))
   raw_file_path = file_obj.read_random(24, 520)

Now, after extracting these values, we need to interpret the integers into human-readable form using the sizeof_fmt() and parse_windows_filetime() functions, as shown below.

   file_size = sizeof_fmt(raw_file_size[0])
   deleted_time = parse_windows_filetime(raw_deleted_time[0])
   file_path = raw_file_path.decode("utf16").strip("\x00")
   return {'file_size': file_size, 'file_path': file_path, 'deleted_time': deleted_time}

Now, we need to define the sizeof_fmt() function as follows

def sizeof_fmt(num, suffix = 'B'):
   for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
      if abs(num) < 1024.0:
         return "%3.1f%s%s" % (num, unit, suffix)
      num /= 1024.0
   return "%.1f%s%s" % (num, 'Yi', suffix)

Now, define a function that interprets an integer as a formatted date and time, as shown below.

def parse_windows_filetime(date_value):
   microseconds = float(date_value) / 10
   ts = datetime.datetime(1601, 1, 1) + datetime.timedelta(microseconds = microseconds)
   return ts.strftime('%Y-%m-%d %H:%M:%S.%f')

Now, we’ll define the write_csv() method to write the processed results to a CSV file, as shown below.

def write_csv(outfile, fieldnames, data):
   with open(outfile, 'wb') as open_outfile:
      csvfile = csv.DictWriter(open_outfile, fieldnames)
      csvfile.writeheader()
      csvfile.writerows(data)

When we run the above script, we will get the data from both the $I and $R files.
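
For example, the script could be invoked as follows (the script name and evidence file name are hypothetical; use ewf for an E01 image or raw for a dd image):

python recycle_bin.py evidence.E01 ewf recycle_bin_report.csv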

Sticky Notes

Windows Sticky Notes replace the real-world habit of writing with pen and paper. These notes float on the desktop and offer different colors, fonts, and other options. In Windows 7, Sticky Notes are stored as an OLE file, so in the following Python script, we will investigate this OLE file to extract metadata from the Sticky Notes.
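
As a quick illustration of the olefile API that the script below relies on, the streams inside a Sticky Notes file can be listed as shown here. This is a minimal sketch, assuming a copy of StickyNotes.snt has been saved to the local working directory:

import olefile

# Minimal sketch (assumption: a copy of StickyNotes.snt sits in the working directory)
if olefile.isOleFile('StickyNotes.snt'):
   ole = olefile.OleFileIO('StickyNotes.snt')
   # Each note is a storage whose streams hold the RTF ('0') and text ('3') content
   for stream in ole.listdir():
      print("{} created: {} modified: {}".format(
         stream, ole.getctime(stream[0]), ole.getmtime(stream[0])))
   ole.close()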

For this Python script, we need to install the third-party modules: olefile, pytsk3, pyewf, and unicodecsv. We can install them using the pip command.

We can follow the steps discussed below to extract information from the Sticky Notes file, StickyNotes.snt.

  • First, open the evidence file and locate all StickyNotes.snt files.
  • Then, parse the metadata and content from the OLE stream and write the RTF content to a file.

  • Finally, create a CSV report of the metadata.

Python Code

Let’s see how to achieve this using Python code –

First, import the following Python libraries –

from __future__ import print_function
from argparse import ArgumentParser

import unicodecsv as csv
import os
import StringIO

from utility.pytskutil import TSKUtil
import olefile

Next, define a global variable that will be used throughout this script –

REPORT_COLS = ['note_id', 'created', 'modified', 'note_text', 'note_file']

Next, we need to provide arguments to the command line handler. Note that it accepts three arguments – the first is the path to the evidence file, the second is the type of evidence file, and the third is the desired output path, as shown below.

if __name__ == '__main__':
   parser = ArgumentParser('Evidence from Sticky Notes')
   parser.add_argument('EVIDENCE_FILE', help="Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help="Evidence file format", choices=('ewf', 'raw'))
   parser.add_argument('REPORT_FOLDER', help="Path to report folder")
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.REPORT_FOLDER)

Now, we will define the main() function, which will be similar to the one in the previous script, as shown below.

def main(evidence, image_type, report_folder):
   tsk_util = TSKUtil(evidence, image_type)
   note_files = tsk_util.recurse_files('StickyNotes.snt', '/Users', 'equals')

Now, let’s iterate over the resulting files. We will then call the parse_snt_file() function to process the file, and then we will use the write_note_rtf() method to write out the RTF file as shown below.

   report_details = []
   for note_file in note_files:
      user_dir = note_file[1].split("/")[1]
      file_like_obj = create_file_like_obj(note_file[2])
      note_data = parse_snt_file(file_like_obj)

      if note_data is None:
         continue
      write_note_rtf(note_data, os.path.join(report_folder, user_dir))
      report_details += prep_note_report(note_data, REPORT_COLS, "/Users" + note_file[1])
   write_csv(os.path.join(report_folder, 'sticky_notes.csv'), REPORT_COLS, report_details)

Next, we need to define the various functions used in this script.

First, we’ll define the create_file_like_obj() function, which takes a pytsk file object and reads the file size. Then, we’ll define the parse_snt_file() function, which accepts a file-like object as input and uses it to read and interpret the sticky notes file.
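
The create_file_like_obj() helper is not shown in the listing that follows. A minimal sketch consistent with the description above (it reads the recovered file's size from the pytsk metadata and wraps the content in a StringIO object so that olefile can seek and read it) might look like this:

def create_file_like_obj(note_file):
   # Read the entire recovered file into memory using its pytsk metadata size
   file_size = note_file.info.meta.size
   file_content = note_file.read_random(0, file_size)
   # Wrap the raw content in a file-like object that olefile can consume
   return StringIO.StringIO(file_content)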

def parse_snt_file(snt_file):
   if not olefile.isOleFile(snt_file):
      print("This is not an OLE file")
      return None
   ole = olefile.OleFileIO(snt_file)
   note = {}

   for stream in ole.listdir():
      if stream[0].count("-") == 3:
         if stream[0] not in note:
            note[stream[0]] = {"created": ole.getctime(stream[0]), "modified": ole.getmtime(stream[0])}
         content = None
         if stream[1] == '0':
            content = ole.openstream(stream).read()
         elif stream[1] == '3':
            content = ole.openstream(stream).read().decode("utf-16")
         if content:
            note[stream[0]][stream[1]] = content
   return note

Now, create the RTF files by defining the write_note_rtf() function, as shown below.

def write_note_rtf(note_data, report_folder):
   if not os.path.exists(report_folder):
      os.makedirs(report_folder)

   for note_id, stream_data in note_data.items():
      fname = os.path.join(report_folder, note_id + ".rtf")
      with open(fname, 'w') as open_file:
         open_file.write(stream_data['0'])

Now, we will translate the nested dictionaries into a flat list of dictionaries that is more suitable for a CSV spreadsheet. This will be done by defining the prep_note_report() function. Finally, we will define the write_csv() function.

def prep_note_report(note_data, report_cols, note_file):
   report_details = []

   for note_id, stream_data in note_data.items():
      report_details.append({
         "note_id": note_id,
         "created": stream_data['created'],
         "modified": stream_data['modified'],
         "note_text": stream_data['3'].strip("x00"),
         "note_file": note_file
      })
   return report_details

def write_csv(outfile, fieldnames, data):
   with open(outfile, 'wb') as open_outfile:
      csvfile = csv.DictWriter(open_outfile, fieldnames)
      csvfile.writeheader()
      csvfile.writerows(data)

After running the above script, we will have obtained metadata from the Sticky Notes file.

Registry File

The Windows Registry file contains many important details that are like a treasure trove of information for forensic analysts. It is a hierarchical database containing details related to operating system configuration, user activities, software installations, and more. In the following Python script, we will access common baseline information from the SYSTEM and SOFTWARE hives.
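
The same Registry module used in the script below can also be pointed directly at a hive that has been exported or copied from a system. The following is a minimal sketch, assuming a copy of the SOFTWARE hive (normally located at C:\Windows\System32\config\SOFTWARE) has been saved to the working directory:

from Registry import Registry

# Minimal sketch (assumption: a copy of the SOFTWARE hive sits in the working directory)
reg = Registry.Registry('SOFTWARE')
nt_curr_ver = reg.open("Microsoft\\Windows NT\\CurrentVersion")
print("Product name: {}".format(nt_curr_ver.value("ProductName").value()))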

For this Python script, we need to install the third-party modules: pytsk3, pyewf, and Registry (distributed as python-registry). We can install them using pip.

We can follow the steps below to extract information from the Windows Registry.

  • First, locate the registry hive to be processed by name and path.
  • Then, we open these files using StringIO and the registry module.

  • Finally, we need to process each hive and print the parsed value to the console for interpretation.

Python Code

Let’s see how to achieve this using Python code

First, import the following Python libraries –

from __future__ import print_function
from argparse import ArgumentParser

import datetime
import StringIO
import struct

from utility.pytskutil import TSKUtil
from Registry import Registry

Now, provide the arguments to the command line handler. Here, it will accept two arguments – the first is the path to the evidence file, and the second is the type of evidence file, as shown below.

if __name__ == '__main__':
   parser = ArgumentParser('Evidence from Windows Registry')
   parser.add_argument('EVIDENCE_FILE', help = "Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help = "Evidence file format",
                       choices = ('ewf', 'raw'))
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE)

Now, we will define the main() function to search for the SYSTEM and SOFTWARE hive files, as shown below.

def main(evidence, image_type):
   tsk_util = TSKUtil(evidence, image_type)
   tsk_system_hive = tsk_util.recurse_files('system', '/Windows/system32/config', 'equals')
   tsk_software_hive = tsk_util.recurse_files('software', '/Windows/system32/config', 'equals')
   system_hive = open_file_as_reg(tsk_system_hive[0][2])
   software_hive = open_file_as_reg(tsk_software_hive[0][2])
   process_system_hive(system_hive)
   process_software_hive(software_hive)

Now, define the function that opens the registry file. To do this, we need to collect the file size from the pytsk metadata, as shown below.

def open_file_as_reg(reg_file):
   file_size = reg_file.info.meta.size
   file_content = reg_file.read_random(0, file_size)
   file_like_obj = StringIO.StringIO(file_content)
   return Registry.Registry(file_like_obj)

Now, with the help of the following method, we can process the SYSTEM hive.

def process_system_hive(hive):
   root = hive.root()
   current_control_set = root.find_key("Select").value("Current").value()
   control_set = root.find_key("ControlSet{:03d}".format(current_control_set))
   raw_shutdown_time = struct.unpack(
      '<Q', control_set.find_key("Control").find_key("Windows").value("ShutdownTime").value())
   shutdown_time = parse_windows_filetime(raw_shutdown_time[0])
   print("Last Shutdown Time: {}".format(shutdown_time))

   time_zone = control_set.find_key("Control").find_key("TimeZoneInformation").value("TimeZoneKeyName").value()
   print("Machine Time Zone: {}".format(time_zone))

   computer_name = control_set.find_key("Control").find_key("ComputerName").find_key("ComputerName").value("ComputerName").value()
   print("Machine Name: {}".format(computer_name))

   last_access = control_set.find_key("Control").find_key("FileSystem").value("NtfsDisableLastAccessUpdate").value()
   last_access = "Disabled" if last_access == 1 else "Enabled"
   print("Last Access Updates: {}".format(last_access))

Now, we need to define a function that interprets an integer as a formatted date and time, as shown below.

def parse_windows_filetime(date_value):
   microseconds = float(date_value) / 10
   ts = datetime.datetime(1601, 1, 1) + datetime.timedelta(microseconds = microseconds)
   return ts.strftime('%Y-%m-%d %H:%M:%S.%f')

def parse_unix_epoch(date_value):
   ts = datetime.datetime.fromtimestamp(date_value)
   return ts.strftime('%Y-%m-%d %H:%M:%S.%f')

Now, with the help of the following method, we can process the SOFTWARE hive.

def process_software_hive(hive):
   root = hive.root()
   nt_curr_ver = root.find_key("Microsoft").find_key("Windows NT").find_key("CurrentVersion")

   print("Product name: {}".format(nt_curr_ver.value("ProductName").value()))
   print("CSD Version: {}".format(nt_curr_ver.value("CSDVersion").value()))
   print("Current Build: {}".format(nt_curr_ver.value("CurrentBuild").value()))
   print("Registered Owner: {}".format(nt_curr_ver.value("RegisteredOwner").value()))
   print("Registered Org: {}".format(nt_curr_ver.value("RegisteredOrganization").value()))

   raw_install_date = nt_curr_ver.value("InstallDate").value()
   install_date = parse_unix_epoch(raw_install_date)
   print("Installation Date: {}".format(install_date))

Running the above script will print the configuration details stored in the Windows Registry hives.
