Digital Mobile Device Forensics with Python

This chapter explains Python digital forensics on mobile devices and the concepts involved.

Introduction

Mobile device forensics is a branch of digital forensics that involves acquiring and analyzing mobile devices to recover digital evidence necessary for an investigation. This branch differs from computer forensics because mobile devices have a built-in communication system that can provide useful information related to location.

Although smartphones are increasingly examined in digital forensics, they are still considered nonstandard because of their heterogeneity. Computer hardware such as hard drives, on the other hand, is considered standard, and its examination has developed into a stable discipline. Within the digital forensics industry, there is much debate about the techniques used on nonstandard devices that hold ephemeral evidence, such as smartphones.

Artifacts that can be extracted from mobile devices

Modern mobile devices hold a vast amount of digital information, compared to older phones that only had call logs or text messages. Therefore, mobile devices can provide investigators with a wealth of information about their users. Some artifacts that can be extracted from mobile devices are listed below.

  • Messages – These are useful artifacts that can reveal the owner’s state of mind and can even give the investigator previously unknown information.
  • Location History – Location history data is a useful artifact that investigators can use to verify a person’s particular location.
  • Installed Applications – By examining which applications are installed, investigators can gain insight into the habits and interests of the mobile user.

Evidence Sources and Python Processing

Smartphones have SQLite databases and PLIST files as primary sources of evidence. In this section, we will use Python to process these sources of evidence.

Analyzing PLIST Files

PLIST (property list) is a flexible and convenient format for storing application data, particularly on iPhone devices. It uses the .plist extension. This type of file stores information about bundles and applications. It can be in two formats: XML and binary. The following Python code will open and read a PLIST file. Note that before proceeding, we must create our own Info.plist file.

First, install a third-party library called biplist using the following command.

pip install biplist

Now, let’s import some useful libraries to work with plist files —

import biplist
import os
import sys

Now, inside the main() function, we can read the plist file into a variable as follows:

def main(plist):
   try:
      data = biplist.readPlist(plist)
   except (biplist.InvalidPlistException, biplist.NotBinaryPlistException) as e:
      print("[-] Invalid PLIST file - unable to be opened by biplist")
      sys.exit(1)

Now, we can read the data from this variable in the console or print it directly.
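As an alternative sketch, Python 3's standard-library plistlib module can also read both XML and binary property lists without a third-party dependency. The helper name read_plist and the sample keys below are illustrative assumptions, not values taken from a real device.

```python
import os
import plistlib
import tempfile
from xml.parsers.expat import ExpatError


def read_plist(path):
    """Return the parsed contents of an XML or binary plist, or None if invalid."""
    try:
        with open(path, "rb") as handle:
            return plistlib.load(handle)
    except (plistlib.InvalidFileException, ExpatError):
        return None


# Build a small binary plist so the example runs without a real Info.plist.
sample = {"CFBundleName": "ExampleApp", "CFBundleVersion": "1.0"}
tmp_path = os.path.join(tempfile.mkdtemp(), "Info.plist")
with open(tmp_path, "wb") as handle:
    plistlib.dump(sample, handle, fmt=plistlib.FMT_BINARY)

data = read_plist(tmp_path)
for key, value in sorted(data.items()):
    print("{}: {}".format(key, value))
```

Returning None instead of exiting keeps the helper reusable when many plist files are processed in one run.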

SQLite Database

SQLite is the primary data repository on mobile devices. It is an in-process library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. Because it needs no configuration, there is no separate database server to install or set up on the system, unlike most other databases.

If you are new to SQLite databases, you can follow the tutorial at www.tutorialspoint.com/sqlite/index.htm. Furthermore, if you want to learn more about using SQLite with Python, a dedicated SQLite tutorial for Python is worth following. During mobile forensics, we can interact with the mobile device’s sms.db file and extract valuable information from the message table. Python has a built-in library called sqlite3 for connecting to SQLite databases. You can import it with the following command.

import sqlite3

Now, with the help of the following command, we can connect to the database, which in this case is called sms.db.

conn = sqlite3.connect('sms.db')
c = conn.cursor()

Here, c is the cursor object, which allows us to interact with the database.

Now, suppose we want to execute a specific command, such as retrieving all rows from a table named abc. This can be accomplished using the following command:

c.execute("SELECT * FROM abc")

The results of the above command are held by the cursor object, so the cursor should be closed with c.close() only after the results have been read. We can use the fetchall() method to dump the results into a variable we can manipulate.

We can use the following commands to retrieve the column names of the message table in sms.db:

c.execute("pragma table_info(message)")
table_data = c.fetchall()
columns = [x[1] for x in table_data]

Note that here we are using the SQLite PRAGMA command, a special command used to control various environment variables and state flags within the SQLite environment. In the above code, the fetchall() method returns a list of tuples, one for each column of the table. The name of each column is stored at index 1 of its tuple.

Now, with the help of the following command, we can query all the data in the table and store it in a variable named data_msg.

c.execute("Select * from message")
data_msg = c.fetchall()

The above command will store the data in the variable, and we can also write the data to a CSV file using the csv.writer() method.
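The steps above can be sketched end to end as follows. This is a minimal example under stated assumptions: it uses an in-memory database as a stand-in for sms.db, and the three columns (id, address, text) are invented for illustration, since the real message table has a different and much larger schema.

```python
import csv
import io
import sqlite3

# In-memory stand-in for sms.db; the table layout is illustrative only.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, address TEXT, text TEXT)")
c.executemany("INSERT INTO message (address, text) VALUES (?, ?)",
              [("+15550100", "hello"), ("+15550101", "see you at 5")])
conn.commit()

# PRAGMA table_info returns one row per column; the column name is at index 1.
c.execute("PRAGMA table_info(message)")
columns = [row[1] for row in c.fetchall()]

c.execute("SELECT * FROM message")
data_msg = c.fetchall()

# Write a header row followed by every message row.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(columns)
writer.writerows(data_msg)
conn.close()

print(buffer.getvalue())
```

Writing to an io.StringIO buffer keeps the sketch free of side effects; for a real report, the buffer would be replaced by a file opened with newline="".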

iTunes Backups

iPhone forensics can be performed using iTunes backups. Forensic investigators often rely on analyzing logical backups of the iPhone acquired through iTunes. iTunes uses the AFC (Apple File Conduit) protocol to take the backup. Furthermore, apart from the escrow key records, the backup process does not modify anything on the iPhone.

Now, the question arises: why do digital forensics experts need to understand iTunes backup technology? It matters when we have access to the suspect’s computer rather than to the iPhone itself, because when a computer is used to sync with the iPhone, most of the information on the iPhone is likely to be backed up on that computer.

Backup Process and Location

Whenever an Apple product is backed up to a computer, it syncs with iTunes, and a folder named after the device’s unique ID is created. In the latest backup format, files are stored in subfolders named after the first two hexadecimal characters of the file name. Among these backup files, some are particularly useful, such as Info.plist, along with a database called Manifest.db. The following table shows the location of iTunes backups, which varies with the operating system:

Operating System    Backup Location
Win7                C:\Users\[Username]\AppData\Roaming\Apple Computer\MobileSync\Backup\
Mac OS X            ~/Library/Application Support/MobileSync/Backup/
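The two-character subfolder layout described above can be sketched as a small helper. The function name backup_file_path and the 40-character file name used here are made up for illustration.

```python
import os


def backup_file_path(backup_dir, file_id):
    """Map a backed-up file's name to its location inside one backup folder."""
    # The subfolder is named after the first two hexadecimal characters.
    return os.path.join(backup_dir, file_id[:2], file_id)


example_id = "ab" + "0" * 38  # arbitrary 40-character hexadecimal name
print(backup_file_path("backup_root", example_id))
```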

To process iTunes backups with Python, we first need to identify all backup locations based on our operating system. Then, we will iterate through each backup and read the Manifest.db database.

Now, with the help of the following Python code, we can do the same −

First, import the necessary libraries as shown below

from __future__ import print_function
import argparse
import logging
import os

from shutil import copyfile
import sqlite3
import sys
logger = logging.getLogger(__name__)

Now, provide two positional arguments, INPUT_DIR and OUTPUT_DIR, representing the iTunes backup folder and the desired output folder.

if __name__ == "__main__":
   parser = argparse.ArgumentParser()
   parser.add_argument("INPUT_DIR", help = "Location of folder containing iOS backups, "
      "e.g. ~/Library/Application Support/MobileSync/Backup folder")
   parser.add_argument("OUTPUT_DIR", help = "Output Directory")
   parser.add_argument("-l", help = "Log file path", default = __file__[:-2] + "log")
   parser.add_argument("-v", help = "Increase verbosity", action = "store_true")
   args = parser.parse_args()
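To see how these arguments behave, the parser can be exercised without touching the real command line. The sketch below is only an illustration: the build_parser helper, the description text, and the fixed default log name ibackup.log are assumptions rather than part of the script above.

```python
import argparse


def build_parser():
    """Create a parser with the same four arguments as the walkthrough."""
    parser = argparse.ArgumentParser(description="Process iOS backups (illustrative)")
    parser.add_argument("INPUT_DIR", help="Folder containing iOS backups")
    parser.add_argument("OUTPUT_DIR", help="Output directory")
    parser.add_argument("-l", help="Log file path", default="ibackup.log")
    parser.add_argument("-v", help="Increase verbosity", action="store_true")
    return parser


# Passing a list to parse_args() avoids reading sys.argv, which is handy for testing.
args = build_parser().parse_args(["backups", "out", "-v"])
print(args.INPUT_DIR, args.OUTPUT_DIR, args.l, args.v)
```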

Now, set the logging level as follows:

   if args.v:
      logger.setLevel(logging.DEBUG)
   else:
      logger.setLevel(logging.INFO)

Now, format the message for this log as shown below.

   msg_fmt = logging.Formatter("%(asctime)-15s %(funcName)-13s" "%(levelname)-8s %(message)s")
   strhndl = logging.StreamHandler(sys.stderr)
   strhndl.setFormatter(fmt = msg_fmt)

   fhndl = logging.FileHandler(args.l, mode = 'a')
   fhndl.setFormatter(fmt = msg_fmt)

   logger.addHandler(strhndl)
   logger.addHandler(fhndl)
   logger.info("Starting iBackup Visualizer")
   logger.debug("Supplied arguments: {}".format(" ".join(sys.argv[1:])))
   logger.debug("System: " + sys.platform)
   logger.debug("Python Version: " + sys.version)

The following lines create the desired output directory, along with any missing parent folders, by using the os.makedirs() function.

   if not os.path.exists(args.OUTPUT_DIR):
      os.makedirs(args.OUTPUT_DIR)

Now, pass the supplied input and output directories to the main() function as shown below:

   if os.path.exists(args.INPUT_DIR) and os.path.isdir(args.INPUT_DIR):
      main(args.INPUT_DIR, args.OUTPUT_DIR)
   else:
      logger.error("Supplied input directory does not exist or is not a directory")
      sys.exit(1)

Now, write the main() function, which in turn calls the backup_summary() function to identify all backups present in the input folder.

def main(in_dir, out_dir):
   backups = backup_summary(in_dir)

def backup_summary(in_dir):
   logger.info("Identifying all iOS backups in {}".format(in_dir))
   root = os.listdir(in_dir)
   backups = {}

   for x in root:
      temp_dir = os.path.join(in_dir, x)
      # Backup folders are named with a 40-character identifier
      if os.path.isdir(temp_dir) and len(x) == 40:
         num_files = 0
         size = 0

         # A distinct name keeps the os.walk() result from shadowing root above
         for dirpath, subdir, files in os.walk(temp_dir):
            num_files += len(files)
            size += sum(os.path.getsize(os.path.join(dirpath, name))
               for name in files)
         backups[x] = [temp_dir, num_files, size]
   return backups

Now, back in main(), print a summary of each backup to the console as shown below:

   print("Backup Summary")
   print("=" * 20)

   if len(backups) > 0:
      for i, b in enumerate(backups):
         print("Backup No.: {} \n"
            "Backup Dev. Name: {} \n"
            "# Files: {} \n"
            "Backup Size (Bytes): {} \n".format(i, b, backups[b][1], backups[b][2]))

Now, dump the contents of the Manifest.db file into a variable called db_items.

         try:
            db_items = process_manifest(backups[b][0])
         except IOError:
            logger.warning("Non-iOS 10 backup encountered or "
               "invalid backup. Continuing to next backup.")
            continue

Now, let’s define a function that takes the directory path of a backup and locates its Manifest.db file:

def process_manifest(backup):
   manifest = os.path.join(backup, "Manifest.db")

   if not os.path.exists(manifest):
      logger.error("Manifest DB not found in {}".format(manifest))
      raise IOError

Now, using sqlite3, we connect to the database and read its Files table through a cursor named c:

   conn = sqlite3.connect(manifest)
   c = conn.cursor()
   items = {}

   for row in c.execute("SELECT * from Files;"):
      items[row[0]] = [row[2], row[1], row[3]]
   return items

Back in main(), the items of each backup are passed to create_files(). If no backups were found at all, the script logs a warning and exits:

         create_files(in_dir, out_dir, b, db_items)
      print("=" * 20)
   else:
      logger.warning("No valid backups found. The input directory should be "
         "the parent-directory immediately above the SHA-1 hash "
         "iOS device backups")
      sys.exit(2)
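The positional indexes used in process_manifest() are hard to read at a glance. As a hedged sketch, the same dictionary can be built by column name with sqlite3.Row. The column names fileID, domain, relativePath, and flags are an assumption about the Files table that is consistent with the index order above, and the single row is invented test data in an in-memory database.

```python
import sqlite3

# In-memory stand-in for Manifest.db; schema and row are assumptions.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access by column name
conn.execute("CREATE TABLE Files (fileID TEXT, domain TEXT, relativePath TEXT, flags INTEGER)")
conn.execute("INSERT INTO Files VALUES (?, ?, ?, ?)",
             ("ab" + "0" * 38, "HomeDomain", "Library/SMS/sms.db", 1))

items = {}
for row in conn.execute("SELECT fileID, domain, relativePath, flags FROM Files"):
    # Same layout as before: key is the file ID, value is [path, domain, flags].
    items[row["fileID"]] = [row["relativePath"], row["domain"], row["flags"]]
conn.close()

print(items)
```

Naming the columns in the SELECT statement also protects the code from changes in column order between backup formats.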

Now, define the create_files() method as follows:

def create_files(in_dir, out_dir, b, db_items):
   msg = "Copying Files for backup {} to {}".format(b, os.path.join(out_dir, b))
   logger.info(msg)

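Now, iterate over each key in the db_items dictionary. Entries without a path are skipped; for the rest, we build the destination path, create its folder if needed, and work out where the file sits inside the backup:

   # Counts manifest entries whose file is missing from the backup
   files_not_found = 0
   for x, key in enumerate(db_items):
      if db_items[key][0] is None or db_items[key][0] == "":
         continue
      else:
         dirpath = os.path.join(out_dir, b, os.path.dirname(db_items[key][0]))
         filepath = os.path.join(out_dir, b, db_items[key][0])

         if not os.path.exists(dirpath):
            os.makedirs(dirpath)

         original_dir = b + "/" + key[0:2] + "/" + key
         path = os.path.join(in_dir, original_dir)

         # Avoid overwriting a file that was already copied to the same path
         if os.path.exists(filepath):
            filepath = filepath + "_{}".format(x)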

Now, use the shutil.copyfile() method to copy the backed-up file, as shown below. Files that are listed in Manifest.db but missing from the backup are counted, and the backup’s own metadata files are copied at the end:

         try:
            copyfile(path, filepath)
         except IOError:
            logger.debug("File not found in backup: {}".format(path))
            files_not_found += 1

   if files_not_found > 0:
      logger.warning("{} files listed in the Manifest.db not "
         "found in backup".format(files_not_found))

   copyfile(os.path.join(in_dir, b, "Info.plist"), os.path.join(out_dir, b, "Info.plist"))
   copyfile(os.path.join(in_dir, b, "Manifest.db"), os.path.join(out_dir, b, "Manifest.db"))
   copyfile(os.path.join(in_dir, b, "Manifest.plist"), os.path.join(out_dir, b, "Manifest.plist"))
   copyfile(os.path.join(in_dir, b, "Status.plist"), os.path.join(out_dir, b, "Status.plist"))

With the above Python script, we can rebuild the backup’s original file structure in the output folder. If the backup is encrypted, a Python library such as pycrypto can be used to decrypt it.

Wi-Fi

Mobile devices can connect to the outside world through widely available Wi-Fi networks. Sometimes, devices automatically connect to these open networks.

In the case of an iPhone, the list of open Wi-Fi connections the device has connected to is stored in a PLIST file called com.apple.wifi.plist. This file will contain the Wi-Fi SSID, BSSID, and connection time.
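A minimal sketch of reading such a file with the standard-library plistlib module is shown below. The key names "List of known networks", "SSID_STR", "BSSID", and "lastJoined" are assumptions about the layout of com.apple.wifi.plist and may differ between iOS versions; the sample data is synthetic.

```python
import datetime
import plistlib


def known_networks(plist_bytes):
    """Return (SSID, BSSID, last joined) tuples from Wi-Fi plist data."""
    data = plistlib.loads(plist_bytes)
    networks = []
    # Key names below are assumed, not confirmed for every iOS version.
    for entry in data.get("List of known networks", []):
        networks.append((entry.get("SSID_STR"),
                         entry.get("BSSID"),
                         entry.get("lastJoined")))
    return networks


# Synthetic plist so the example runs without a real device file.
sample = plistlib.dumps({
    "List of known networks": [
        {"SSID_STR": "CoffeeShop", "BSSID": "00:11:22:33:44:55",
         "lastJoined": datetime.datetime(2020, 1, 2, 3, 4, 5)},
    ]
})

for ssid, bssid, joined in known_networks(sample):
    print(ssid, bssid, joined)
```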

We will use Python to extract Wi-Fi details from a standard Cellebrite XML report. We will then look those details up through the API of the Wireless Geographic Logging Engine (WIGLE), a popular platform for finding the location of a device using the names of Wi-Fi networks.

We can access WIGLE’s API using a Python library called requests. It can be installed as follows:

pip install requests

API from WIGLE

We need to register on WIGLE’s website, https://wigle.net/account, to access the free API from WIGLE. Below is a Python script that retrieves information about the user’s device and its connections through the WIGLE API:

First, import the following libraries to handle various tasks:

from __future__ import print_function

import argparse
import csv
import os
import sys
import xml.etree.ElementTree as ET
import requests

Now, provide two positional arguments, INPUT_FILE and OUTPUT_CSV, which represent the input file containing Wi-Fi MAC addresses and the desired output CSV file, respectively.

if __name__ == "__main__":
   parser = argparse.ArgumentParser()
   parser.add_argument("INPUT_FILE", help = "INPUT FILE with MAC Addresses")
   parser.add_argument("OUTPUT_CSV", help = "Output CSV File")
   parser.add_argument("-t", help = "Input type: Cellebrite XML report or TXT file",
      choices = ('xml', 'txt'), default = "xml")
   parser.add_argument('--api', help = "Path to API key file",
      default = os.path.expanduser("~/.wigle_api"), type = argparse.FileType('r'))
   args = parser.parse_args()

The following lines of code check that the input path exists and is a file, and exit the script otherwise. They also create the output directory if needed and read the API credentials, which are expected on the first line of the key file as two values separated by a colon:

   if not os.path.exists(args.INPUT_FILE) or not os.path.isfile(args.INPUT_FILE):
      print("[-] {} does not exist or is not a file".format(args.INPUT_FILE))
      sys.exit(1)

   directory = os.path.dirname(args.OUTPUT_CSV)
   if directory != '' and not os.path.exists(directory):
      os.makedirs(directory)

   api_key = args.api.readline().strip().split(":")

Now, pass the arguments to main(), which selects the parser that matches the input type and then queries WIGLE:

   main(args.INPUT_FILE, args.OUTPUT_CSV, args.t, api_key)

def main(in_file, out_csv, type, api_key):
   if type == 'xml':
      wifi = parse_xml(in_file)
   else:
      wifi = parse_txt(in_file)

   query_wigle(wifi, out_csv, api_key)

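Now, we will parse the XML file as shown below:

def parse_xml(xml_file):
   wifi = {}

   print("[+] Opening {} report".format(xml_file))

   xml_tree = ET.parse(xml_file)
   print("[+] Parsing report for all connected WiFi addresses")

   root = xml_tree.getroot()
   # The later steps prefix tag names with xmlns; derive that "{namespace}" prefix from the root tag
   xmlns = root.tag[:root.tag.index("}") + 1] if root.tag.startswith("{") else ""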

Now, iterate through the child elements of the root as follows:

   for child in root.iter():
      if child.tag == xmlns + "model":
         if child.get("type") == "Location":
            for field in child.findall(xmlns + "field"):
               if field.get("name") == "TimeStamp":
                  ts_value = field.find(xmlns + "value")
                  try:
                     ts = ts_value.text
                  except AttributeError:
                     continue

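Now, we will check whether the string 'SSID' is present in the text of the value. If it is, the text is split on the tab character into its BSSID and SSID parts, and the leading label is sliced off each part.

               if field.get("name") == "Description":  # field name is an assumption, not given in this text
                  value = field.find(xmlns + "value")
                  if value is not None and value.text and "SSID" in value.text:
                     bssid, ssid = value.text.split("\t")
                     bssid = bssid[7:]
                     ssid = ssid[6:]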

Now, we need to add the BSSID, SSID, and timestamp to the wifi dictionary, as shown below.

                     if bssid in wifi.keys():
                        wifi[bssid]["Timestamps"].append(ts)
                        wifi[bssid]["SSID"].append(ssid)
                     else:
                        wifi[bssid] = {"Timestamps": [ts], "SSID": [ssid], "Wigle": {}}
   return wifi
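The parsing steps above can be tried on a small synthetic document. Everything in the sample is an assumption made for illustration: the namespace URI is a placeholder, the project root element is invented, and the field names only mirror the walkthrough. A real Cellebrite report is far larger and may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Synthetic report; "\t" below becomes a real tab between the two parts.
SAMPLE = """<project xmlns="http://example.com/report">
  <model type="Location">
    <field name="TimeStamp"><value>2020-01-02T03:04:05</value></field>
    <field name="Description"><value>BSSID: 00:11:22:33:44:55\tSSID: CoffeeShop</value></field>
  </model>
</project>"""

root = ET.fromstring(SAMPLE)
# ElementTree stores namespaced tags as "{uri}tag"; recover the "{uri}" prefix.
xmlns = root.tag[:root.tag.index("}") + 1] if root.tag.startswith("{") else ""

wifi = {}
for model in root.iter(xmlns + "model"):
    if model.get("type") != "Location":
        continue
    ts, desc = None, None
    for field in model.findall(xmlns + "field"):
        value = field.find(xmlns + "value")
        if value is None or value.text is None:
            continue
        if field.get("name") == "TimeStamp":
            ts = value.text
        elif field.get("name") == "Description":
            desc = value.text
    if desc and "SSID" in desc:
        bssid, ssid = desc.split("\t")
        # Slice off the "BSSID: " (7 characters) and "SSID: " (6 characters) labels.
        entry = wifi.setdefault(bssid[7:], {"Timestamps": [], "SSID": [], "Wigle": {}})
        entry["Timestamps"].append(ts)
        entry["SSID"].append(ssid[6:])

print(wifi)
```

Collecting the timestamp and description per model before storing them avoids depending on the order of the field elements.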

The text parser is much simpler than the XML parser, as shown below.

def parse_txt(txt_file):
   wifi = {}
   print("[+] Extracting MAC addresses from {}".format(txt_file))

   with open(txt_file) as mac_file:
      for line in mac_file:
         wifi[line.strip()] = {"Timestamps": ["N/A"], "SSID": ["N/A"], "Wigle": {}}
   return wifi

Now, let’s use the requests module to call the WIGLE API. The query_mac_addr() helper performs a single lookup, and the query_wigle() method calls it for every MAC address.

def query_mac_addr(mac_addr, api_key):
   query_url = ("https://api.wigle.net/api/v2/network/search?"
      "onlymine=false&freenet=false&paynet=false"
      "&netid={}".format(mac_addr))
   req = requests.get(query_url, auth = (api_key[0], api_key[1]))
   return req.json()

def query_wigle(wifi_dictionary, out_csv, api_key):
   print("[+] Querying Wigle.net through Python API for {} "
      "APs".format(len(wifi_dictionary)))
   for mac in wifi_dictionary:
      wigle_results = query_mac_addr(mac, api_key)
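Because a MAC address contains colons, it can be safer to let a library encode the query parameters instead of formatting them into the URL by hand. The sketch below uses only the standard library to show the resulting URL; the helper name build_query_url is hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.wigle.net/api/v2/network/search"


def build_query_url(mac_addr):
    """Return the search URL for one BSSID, with parameters percent-encoded."""
    params = {
        "onlymine": "false",
        "freenet": "false",
        "paynet": "false",
        "netid": mac_addr,
    }
    return BASE_URL + "?" + urlencode(params)


print(build_query_url("00:11:22:33:44:55"))
```

With requests, the same encoding is applied when the dictionary is passed through the params keyword argument of requests.get().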

The WIGLE API limits the number of calls allowed per day. If this limit is exceeded, the response carries an error instead of results, which we report as shown below.

      try:
         if wigle_results["resultCount"] == 0:
            wifi_dictionary[mac]["Wigle"]["results"] = []
            continue
         else:
            wifi_dictionary[mac]["Wigle"] = wigle_results
      except KeyError:
         if wigle_results["error"] == "too many queries today":
            print("[-] Wigle daily query limit exceeded")
            wifi_dictionary[mac]["Wigle"]["results"] = []
            continue
         else:
            print("[-] Other error encountered for "
               "address {}: {}".format(mac, wigle_results['error']))
            wifi_dictionary[mac]["Wigle"]["results"] = []
            continue
   prep_output(out_csv, wifi_dictionary)

Now, we’ll use the prep_output() method to flatten the dictionary into smaller chunks that are easier to write.

def prep_output(output, data):
   csv_data = {}
   google_map = "https://www.google.com/maps/search/"

Now, access all the data we have collected so far as follows −

   for x, mac in enumerate(data):
      for y, ts in enumerate(data[mac]["Timestamps"]):
         for z, result in enumerate(data[mac]["Wigle"]["results"]):
            shortres = data[mac]["Wigle"]["results"][z]
            g_map_url = "{}{},{}".format(google_map, shortres["trilat"], shortres["trilong"])

Finally, the write_csv() function writes the output to a CSV file, just as we did in the script earlier in this chapter.
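Since the bodies of prep_output() and write_csv() are only partly shown, the following is a self-contained sketch of one way the flattening and writing could work. The function signatures, the column headers, and the sample coordinates are assumptions made for illustration, not the chapter's actual implementation.

```python
import csv
import io

GOOGLE_MAP = "https://www.google.com/maps/search/"
HEADERS = ["BSSID", "SSID", "Timestamp", "Latitude", "Longitude", "Google Map URL"]


def flatten(data):
    """Turn the nested Wi-Fi dictionary into one flat row per timestamp and result."""
    rows = []
    for mac, details in data.items():
        for ts, ssid in zip(details["Timestamps"], details["SSID"]):
            for result in details["Wigle"].get("results", []):
                rows.append({
                    "BSSID": mac,
                    "SSID": ssid,
                    "Timestamp": ts,
                    "Latitude": result["trilat"],
                    "Longitude": result["trilong"],
                    "Google Map URL": "{}{},{}".format(
                        GOOGLE_MAP, result["trilat"], result["trilong"]),
                })
    return rows


def write_csv(handle, rows):
    """Write the flat rows to an open text handle, header first."""
    writer = csv.DictWriter(handle, fieldnames=HEADERS)
    writer.writeheader()
    writer.writerows(rows)


# Synthetic input in the same shape as the wifi dictionary built earlier.
sample = {"00:11:22:33:44:55": {
    "Timestamps": ["2020-01-02T03:04:05"],
    "SSID": ["CoffeeShop"],
    "Wigle": {"results": [{"trilat": 40.0, "trilong": -74.5}]},
}}

out = io.StringIO()
write_csv(out, flatten(sample))
print(out.getvalue())
```

Entries whose Wigle dictionary has an empty results list produce no rows, so addresses that WIGLE could not resolve are left out of the CSV in this sketch.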
