Digital Forensics with Python: Important Artifacts in Windows – II

This chapter describes some of the more important artifacts in Windows and how to extract them using Python.

User Activity

Windows uses the NTUSER.DAT file to store various user activities. Each user profile has a hive like NTUSER.DAT that specifically stores information and configuration related to that user. Therefore, it is very useful for forensic analysts in their investigations.

The following Python script will parse some keys in NTUSER.DAT to explore a user's behavior on the system. Before proceeding, we need to install the third-party modules the script depends on: Registry, pytsk3, pyewf, and Jinja2. We can install these using pip. Note that the scripts in this chapter target Python 2, as the use of StringIO and xrange indicates.
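
As a sketch, the installation commands might look like the following. The package names are assumptions: the pip package that provides the Registry module is python-registry, and pyewf usually ships as part of the libewf project's Python bindings, so the exact names may vary by environment.

```shell
# Assumed package names - verify against your environment before installing.
pip install python-registry pytsk3 jinja2
# pyewf is typically built and installed from the libewf project's Python bindings.
```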

We can extract information from the NTUSER.DAT file by following these steps:

  • First, we search the system for all NTUSER.DAT files.
  • Then, we parse each NTUSER.DAT file for its WordWheelQuery, TypePath, and RunMRU keys.

  • Finally, we write these processed artifacts to an HTML report using the Jinja2 module.

Python Code

Let’s see how to achieve this using Python code –

First, we need to import the following Python modules –

from __future__ import print_function
from argparse import ArgumentParser

import os
import StringIO
import struct

from utility.pytskutil import TSKUtil
from Registry import Registry
import jinja2

Now, provide arguments to the command line handler. It will accept three arguments – the first is the path to the evidence file, the second is the type of evidence file, and the third is the output path for the desired HTML report, as shown below.

if __name__ == '__main__':
   parser = ArgumentParser('Information from user activities')
   parser.add_argument('EVIDENCE_FILE', help="Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help="Evidence file format", choices=('ewf', 'raw'))
   parser.add_argument('REPORT', help="Path to report file")
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.REPORT)

Now, let’s define the main() function to search for all NTUSER.DAT files, as shown.

def main(evidence, image_type, report):
   tsk_util = TSKUtil(evidence, image_type)
   tsk_ntuser_hives = tsk_util.recurse_files('ntuser.dat', '/Users', 'equals')

   nt_rec = {
      'wordwheel': {'data': [], 'title': 'WordWheel Query'},
      'typed_path': {'data': [], 'title': 'Typed Paths'},
      'run_mru': {'data': [], 'title': 'Run MRU'}
   }

Now, we will try to find the Explorer key in each NTUSER.DAT hive. If the key is missing, we skip that user; otherwise, we pass the key to the handling functions, as shown below.

   for ntuser in tsk_ntuser_hives:
      uname = ntuser[1].split("/")

      open_ntuser = open_file_as_reg(ntuser[2])
      try:
         explorer_key = open_ntuser.root().find_key("Software") \
            .find_key("Microsoft").find_key("Windows") \
            .find_key("CurrentVersion").find_key("Explorer")
      except Registry.RegistryKeyNotFoundException:
         continue
      nt_rec['wordwheel']['data'] += parse_wordwheel(explorer_key, uname)
      nt_rec['typed_path']['data'] += parse_typed_paths(explorer_key, uname)
      nt_rec['run_mru']['data'] += parse_run_mru(explorer_key, uname)

   nt_rec['wordwheel']['headers'] = nt_rec['wordwheel']['data'][0].keys()
   nt_rec['typed_path']['headers'] = nt_rec['typed_path']['data'][0].keys()
   nt_rec['run_mru']['headers'] = nt_rec['run_mru']['data'][0].keys()

Now, pass the dictionary object and the report path to the write_html() method, as shown below.

   write_html(report, nt_rec)

Now, define a method that takes a pytsk file handle, reads its contents into a StringIO buffer, and passes that buffer to the Registry class.

def open_file_as_reg(reg_file):
   file_size = reg_file.info.meta.size
   file_content = reg_file.read_random(0, file_size)
   file_like_obj = StringIO.StringIO(file_content)
   return Registry.Registry(file_like_obj)

Now, we will define the function to parse and process the WordWheelQuery key in the NTUSER.DAT file as shown below.

def parse_wordwheel(explorer_key, username):
   try:
      wwq = explorer_key.find_key("WordWheelQuery")
   except Registry.RegistryKeyNotFoundException:
      return []
   mru_list = wwq.value("MRUListEx").value()
   mru_order = []

   for i in xrange(0, len(mru_list), 2):
      order_val = struct.unpack('h', mru_list[i:i + 2])[0]
      if order_val in mru_order and order_val in (0, -1):
         break
      else:
         mru_order.append(order_val)
   search_list = []

   for count, val in enumerate(mru_order):
      ts = "N/A"
      if count == 0:
         ts = wwq.timestamp()
      search_list.append({
         'timestamp': ts,
         'username': username,
         'order': count,
         'value_name': str(val),
         'search': wwq.value(str(val)).value().decode("UTF-16").strip("\x00")
      })
   return search_list
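
To illustrate what the MRU loop above is doing, here is a small standalone sketch that unpacks 16-bit order values from a synthetic buffer. The buffer contents are made up for demonstration and do not come from a real WordWheelQuery value.

```python
import struct

# Synthetic MRU order buffer: entries 1 and 0 followed by a -1 terminator,
# packed as little-endian 16-bit integers.
buf = struct.pack('<3h', 1, 0, -1)

# Walk the buffer two bytes at a time, decoding each order value.
orders = []
for i in range(0, len(buf), 2):
    orders.append(struct.unpack('<h', buf[i:i + 2])[0])
print(orders)  # [1, 0, -1]
```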

Now, we will define the function to parse and process the TypedPaths key in the NTUSER.DAT file as follows

def parse_typed_paths(explorer_key, username):
   try:
      typed_paths = explorer_key.find_key("TypedPaths")
   except Registry.RegistryKeyNotFoundException:
      return []
   typed_path_details = []

   for val in typed_paths.values():
      typed_path_details.append({
         "username": username,
         "value_name": val.name(),
         "path": val.value()
      })
   return typed_path_details

Now, we will define the function to parse and process the RunMRU key in the NTUSER.DAT file as follows

def parse_run_mru(explorer_key, username):
   try:
      run_mru = explorer_key.find_key("RunMRU")
   except Registry.RegistryKeyNotFoundException:
      return []

   if len(run_mru.values()) == 0:
      return []
   mru_list = run_mru.value("MRUList").value()
   mru_order = []

   for i in mru_list:
      mru_order.append(i)
   mru_details = []

   for count, val in enumerate(mru_order):
      ts = "N/A"
      if count == 0:
         ts = run_mru.timestamp()
      mru_details.append({
         "username": username,
         "timestamp": ts,
         "order": count,
         "value_name": val,
         "run_statement": run_mru.value(val).value()
      })
   return mru_details

The following function will now handle the creation of the HTML report:

def write_html(outfile, data_dict):
   cwd = os.path.dirname(os.path.abspath(__file__))
   env = jinja2.Environment(loader=jinja2.FileSystemLoader(cwd))
   template = env.get_template("user_activity.html")

   rendering = template.render(nt_data=data_dict)

   with open(outfile, 'w') as open_outfile:
      open_outfile.write(rendering)

After running the above script, we will get an HTML report summarizing the information extracted from the NTUSER.DAT files.

LINK Files

Windows creates shortcut files, known as link (LNK) files, when a user or the operating system makes a shortcut for a file that is frequently used, double-clicked, or accessed from a system drive such as an attached storage device. By examining these link files, investigators can reconstruct Windows activity, such as when and from where files were accessed.

Let’s discuss a Python script that can be used to retrieve information from these Windows LINK files.

For this script, install the third-party modules pylnk, pytsk3, and pyewf. We can follow these steps to extract information from lnk files.

  • First, search the system for lnk files.
  • Then, extract information from each file by iterating over its attributes.

  • Finally, write this information to a CSV report.

Python Code

Let’s see how to achieve this using Python code –

First, import the following Python libraries –

from __future__ import print_function
from argparse import ArgumentParser

import csv
import StringIO

from utility.pytskutil import TSKUtil
import pylnk

Now, provide the arguments to the command line handler. Here it will accept three arguments – the first is the path to the evidence file, the second is the type of evidence file, and the third is the desired output path for the CSV report, as shown below.

if __name__ == '__main__':
   parser = ArgumentParser('Parsing LNK files')
   parser.add_argument('EVIDENCE_FILE', help="Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help="Evidence file format", choices=('ewf', 'raw'))
   parser.add_argument('CSV_REPORT', help="Path to CSV report")
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.CSV_REPORT)

Now, create a TSKUtil object to interpret the evidence file and iterate over the file system looking for files ending in lnk. This is done by defining the main() function as shown below.

def main(evidence, image_type, report):
   tsk_util = TSKUtil(evidence, image_type)
   lnk_files = tsk_util.recurse_files("lnk", path="/", logic="endswith")

   if lnk_files is None:
      print("No lnk files found")
      exit(0)
   columns = [
      'command_line_arguments', 'description', 'drive_serial_number',
      'drive_type', 'file_access_time', 'file_attribute_flags',
      'file_creation_time', 'file_modification_time', 'file_size',
      'environmental_variables_location', 'volume_label',
      'machine_identifier', 'local_path', 'network_path',
      'relative_path', 'working_directory'
   ]

Now, with the help of the following code, we will iterate over the lnk files and collect their attributes.

   parsed_lnks = []

   for entry in lnk_files:
      lnk = open_file_as_lnk(entry[2])
      lnk_data = {'lnk_path': entry[1], 'lnk_name': entry[0]}

      for col in columns:
         lnk_data[col] = getattr(lnk, col, "N/A")
      lnk.close()
      parsed_lnks.append(lnk_data)
   write_csv(report, columns + ['lnk_path', 'lnk_name'], parsed_lnks)

Now we need to define two functions: one to open the pytsk file object and the other to write a CSV report, as shown below.

def open_file_as_lnk(lnk_file):
   file_size = lnk_file.info.meta.size
   file_content = lnk_file.read_random(0, file_size)
   file_like_obj = StringIO.StringIO(file_content)
   lnk = pylnk.file()
   lnk.open_file_object(file_like_obj)
   return lnk

def write_csv(outfile, fieldnames, data):
   with open(outfile, 'wb') as open_outfile:
      csvfile = csv.DictWriter(open_outfile, fieldnames)
      csvfile.writeheader()
      csvfile.writerows(data)

After running the above script, we will get the information from the discovered lnk files in a CSV report.

Prefetch Files

Windows creates prefetch files the first time an application is run from a particular location. These files are used to speed up the application startup process. They have the .PF extension and are stored in the Windows\Prefetch folder under the system root.

Digital forensics experts can uncover evidence that a program was executed from a specific location, along with detailed information about the user. Prefetch files are useful artifacts for investigators because their entries persist even after a program is deleted or uninstalled.

Let’s discuss a Python script that extracts information from Windows prefetch files, as shown below.

For the Python script, install the third-party modules pytsk3, pyewf, and unicodecsv. Recall that we have already used these libraries in the Python scripts discussed in previous chapters.

We must follow the steps below to extract information from the prefetch file.

  • First, scan for files with the .pf extension, or prefetch files.
  • Now, perform signature verification to eliminate false positives.

  • Next, parse the format of the Windows prefetch file. This varies depending on the version of Windows. For example, it’s 17 for Windows XP, 23 for Windows Vista and Windows 7, 26 for Windows 8.1, and 30 for Windows 10.

  • Finally, we’ll write the parsed results to a CSV file.
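
The version-to-release mapping described above can be captured in a small lookup table. This is a hypothetical helper for illustration, not part of the script that follows.

```python
# Prefetch format version numbers and the Windows releases that use them,
# as listed in the steps above.
PF_VERSIONS = {
    17: "Windows XP",
    23: "Windows Vista / 7",
    26: "Windows 8.1",
    30: "Windows 10",
}

def describe_pf_version(version):
    """Return the Windows release for a prefetch format version number."""
    return PF_VERSIONS.get(version, "Unknown")

print(describe_pf_version(17))  # Windows XP
```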

Python Code

Let’s see how to achieve this using Python code –

First, import the following Python libraries –

from __future__ import print_function
import argparse
from datetime import datetime, timedelta

import os
import pytsk3
import pyewf
import struct
import sys
import unicodecsv as csv
from utility.pytskutil import TSKUtil

Now, provide an argument to the command line handler. It will accept two arguments: the path to the evidence file, and the type of the evidence file. It also accepts an optional argument for specifying the path to scan for prefetch files.

if __name__ == "__main__":
   parser = argparse.ArgumentParser('Parsing Prefetch files')
   parser.add_argument("EVIDENCE_FILE", help="Evidence file path")
   parser.add_argument("TYPE", help="Type of Evidence", choices=("raw", "ewf"))
   parser.add_argument("OUTPUT_CSV", help="Path to write output csv")
   parser.add_argument("-d", help="Prefetch directory to scan", default="/WINDOWS/PREFETCH")
   args = parser.parse_args()

   if os.path.exists(args.EVIDENCE_FILE) and os.path.isfile(args.EVIDENCE_FILE):
      main(args.EVIDENCE_FILE, args.TYPE, args.OUTPUT_CSV, args.d)
   else:
      print("[-] Supplied input file {} does not exist or is not a "
            "file".format(args.EVIDENCE_FILE))
      sys.exit(1)

Now, interpret the evidence file by creating a TSKUtil object and iterating through the file system looking for files ending in .pf. This can be done by defining the main() function as shown below

def main(evidence, image_type, output_csv, path):
   tsk_util = TSKUtil(evidence, image_type)
   prefetch_dir = tsk_util.query_directory(path)
   prefetch_files = None

   if prefetch_dir is not None:
      prefetch_files = tsk_util.recurse_files(".pf", path=path, logic="endswith")

   if prefetch_files is None:
      print("[-] No .pf files found")
      sys.exit(2)
   print("[+] Identified {} potential prefetch files".format(len(prefetch_files)))
   prefetch_data = []

   for hit in prefetch_files:
      prefetch_file = hit[2]
      pf_version = check_signature(prefetch_file)

Now, define a method to verify the signature, as shown below

def check_signature(prefetch_file):
   version, signature = struct.unpack("<2i", prefetch_file.read_random(0, 8))

   if signature == 1094927187:
      return version
   else:
      return None
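
The magic number 1094927187 checked above is simply the ASCII signature "SCCA" read as a little-endian 32-bit integer, which can be verified directly:

```python
import struct

# Interpret the 4-byte ASCII signature "SCCA" as a little-endian
# signed 32-bit integer.
signature = struct.unpack("<i", b"SCCA")[0]
print(signature)  # 1094927187
```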

Back in the main() loop, skip any file whose signature could not be verified and dispatch on the version number –

      if pf_version is None:
         continue
      pf_name = hit[0]

      if pf_version == 17:
         parsed_data = parse_pf_17(prefetch_file, pf_name)
         parsed_data.append(os.path.join(path, hit[1].lstrip("//")))
         prefetch_data.append(parsed_data)

Now, process the Windows prefetch file. Here we use the Windows XP prefetch file as an example –

def parse_pf_17(prefetch_file, pf_name):
   create = convert_unix(prefetch_file.info.meta.crtime)
   modify = convert_unix(prefetch_file.info.meta.mtime)

def convert_unix(ts):
   if int(ts) == 0:
      return ""
   return datetime.utcfromtimestamp(ts)

def convert_filetime(ts):
   if int(ts) == 0:
      return ""
   return datetime(1601, 1, 1) + timedelta(microseconds=ts / 10)
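
To see what convert_filetime() computes: a FILETIME value counts 100-nanosecond intervals since January 1, 1601 (UTC). For example, the well-known constant 116444736000000000 is the FILETIME representation of the Unix epoch:

```python
from datetime import datetime, timedelta

# FILETIME value for 1970-01-01 00:00:00 UTC
# (100-nanosecond intervals elapsed since 1601-01-01 UTC).
ft = 116444736000000000

# Divide by 10 to convert 100-ns ticks to microseconds, then offset
# from the FILETIME epoch - mirroring convert_filetime() above.
dt = datetime(1601, 1, 1) + timedelta(microseconds=ft // 10)
print(dt)  # 1970-01-01 00:00:00
```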

Now, extract the data embedded in the prefetch file by using a struct, as shown below.

   pf_size, name, vol_info, vol_entries, vol_size, filetime, \
      count = struct.unpack("<i60s32x3iq16xi", prefetch_file.read_random(12, 136))
   name = name.decode("utf-16", "ignore").strip("\x00").split("\x00")[0]

   vol_name_offset, vol_name_length, vol_create, \
      vol_serial = struct.unpack("<2iqi", prefetch_file.read_random(vol_info, 20))
   vol_serial = hex(vol_serial).lstrip("0x")
   vol_serial = vol_serial[:4] + "-" + vol_serial[4:]
   vol_name = struct.unpack(
      "<{}s".format(2 * vol_name_length),
      prefetch_file.read_random(vol_info + vol_name_offset, vol_name_length * 2))[0]

   vol_name = vol_name.decode("utf-16", "ignore").strip("\x00").split("\x00")[0]
   return [
      pf_name, name, pf_size, create,
      modify, convert_filetime(filetime), count, vol_name,
      convert_filetime(vol_create), vol_serial
   ]

So far we have only handled the Windows XP prefetch version. If the script encounters a prefetch file from another Windows version, it must display an error message, as shown below.

      elif pf_version == 23:
         print("[-] Windows Vista / 7 PF file {} -- unsupported".format(pf_name))
         continue
      elif pf_version == 26:
         print("[-] Windows 8 PF file {} -- unsupported".format(pf_name))
         continue
      elif pf_version == 30:
         print("[-] Windows 10 PF file {} -- unsupported".format(pf_name))
         continue
      else:
         print("[-] Signature mismatch - Name: {}\nPath: {}".format(hit[0], hit[1]))
         continue

   write_output(prefetch_data, output_csv)

Finally, a write_output() method is needed to write the results to the CSV report.
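
A minimal sketch of such a write_output() function follows, using the csv interface (the script imports unicodecsv as csv, which mirrors the standard csv module). The header names here are assumptions chosen to match the order of fields returned by parse_pf_17() plus the file path appended in main().

```python
import csv

# Assumed column headers, one per field returned by parse_pf_17()
# plus the file path appended in main().
PF_HEADERS = [
    "File Name", "Executable Name", "File Size (bytes)",
    "File Create Date (UTC)", "File Modify Date (UTC)",
    "Last Execution Date (UTC)", "Execution Count", "Volume Name",
    "Volume Create Date", "Volume Serial", "File Path",
]

def write_output(data, output_csv):
    print("[+] Writing {} rows to {}".format(len(data), output_csv))
    with open(output_csv, "w") as outfile:  # use mode 'wb' with unicodecsv on Python 2
        writer = csv.writer(outfile)
        writer.writerow(PF_HEADERS)
        writer.writerows(data)
```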

After running the above script, we will get the information parsed from Windows XP prefetch files in a CSV report.
