
Automating DFIR - How to series on programming libtsk with python Part 9

Hello Reader,
         Welcome to part 9. I really hope you've read parts 1-8 before getting here; if not, you will likely get pretty lost at this point. Here are the links to the prior parts in this series so you can get up to speed before going forward; each post builds on the last.

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image

Now that we have that taken care of, it's time to start working on DFIR Wizard version 8! The next step in our roadmap is to recurse through directories to hash files rather than just hashing specific files. To do this we won't need any additional libraries! However, we will need to write some new code and be introduced to some new coding concepts that may take a bit to really sink in unless you already have a computer science background.

So let's start with a simplified program. We'll strip out all the code after we open up our NTFS partition and store it in the variable filesystemObject and our code now looks like this:

#!/usr/bin/python
# Sample program for step 8 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
import hashlib
     
class ewf_Img_Info(pytsk3.Img_Info):
  def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)

  def close(self):
    self._ewf_handle.close()

  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)

  def get_size(self):
    return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='Extract the $MFT from all of the NTFS partitions of an E01')
argparser.add_argument(
        '-i', '--image',
        dest='imagefile',
        action="store",
        type=str,
        default=None,
        required=True,
        help='E01 to extract from'
    )
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)

partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))

Previously we were creating file objects by opening a file stored at a known path. If we either a) don't know the path to the file or b) want to look at all the files in the file system, we need something other than a known path stored in a file object to get us there. What we have instead is a directory object. We create a directory object by calling the open_dir function, which filesystemObject inherited from FS_Info, giving it the path to the directory we want to open and then storing the result in a variable. Our code would look like this if we were opening the root directory (aka '/') and storing it in a variable named directoryObject.

directoryObject = filesystemObject.open_dir(path="/")

Now we have a directoryObject which makes available to us a list of all the files and directories stored within that directory. To access each entry in this directory listing we need to use a loop, like we did when we went through all of the partitions available in the image. We are going to use the for loop again, except instead of looping through available partitions we are going to loop through directory entries in a directory. The for loop looks like this:

for entryObject in directoryObject:

This for loop will assign each entry in the root directory to our variable entryObject one at a time until it has gone through all of the directory entries in the root directory.

Now we get to do something with the entries returned. To begin with let's print out their names using the info.name.name attribute of entryObject, just like we did earlier in the series.

print entryObject.info.name.name

The completed code looks like this:

#!/usr/bin/python
# Sample program for step 8 in becoming a DFIR Wizard!
# No license as this code is simple and free!
import sys
import pytsk3
import datetime
import pyewf
import argparse
import hashlib
     
class ewf_Img_Info(pytsk3.Img_Info):
  def __init__(self, ewf_handle):
    self._ewf_handle = ewf_handle
    super(ewf_Img_Info, self).__init__(
        url="", type=pytsk3.TSK_IMG_TYPE_EXTERNAL)

  def close(self):
    self._ewf_handle.close()

  def read(self, offset, size):
    self._ewf_handle.seek(offset)
    return self._ewf_handle.read(size)

  def get_size(self):
    return self._ewf_handle.get_media_size()
argparser = argparse.ArgumentParser(description='List files in a directory')
argparser.add_argument(
        '-i', '--image',
        dest='imagefile',
        action="store",
        type=str,
        default=None,
        required=True,
        help='E01 to extract from'
    )
args = argparser.parse_args()
filenames = pyewf.glob(args.imagefile)
ewf_handle = pyewf.handle()
ewf_handle.open(filenames)
imagehandle = ewf_Img_Info(ewf_handle)

partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  if 'NTFS' in partition.desc:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
    directoryObject = filesystemObject.open_dir(path="/")
    for entryObject in directoryObject:
      print entryObject.info.name.name
 
If you were to run this against the SSFCC-Level5.E01 image you would get the following output:
C:\Users\dave\Desktop>python dfirwizard-v8.py -i SSFCC-Level5.E01
0 Primary Table (#0) 0s(0) 1
1 Unallocated 0s(0) 8064
2 NTFS (0x07) 8064s(4128768) 61759616
$AttrDef
$BadClus
$Bitmap
$Boot
$Extend
$LogFile
$MFT
$MFTMirr
$Secure
$UpCase
$Volume
.
BeardsBeardsBeards.jpg
confession.txt
HappyRob.png
ILoveBeards.jpg
ItsRob.jpg
NiceShirtRob.jpg
OhRob.jpg
OnlyBeardsForMe.jpg
RobGoneWild.jpg
RobInRed.jpg
RobRepresenting.jpg
RobToGo.jpg
WhatchaWantRob.jpg
$OrphanFiles
This code works and prints for us all of the files and directories located within the root directory we specified. However, most forensic images contain file systems that do have sub directories, sometimes thousands of them. To be able to go through all the files and directories stored within a partition we need to change our code to be able to determine if a directory entry is a file or a directory. If the entry is a directory then we need to check the contents of the sub directory as well. We will do this check over and over again until we have checked all the directories within the file system.

To write this kind of code we need to make use of a common computer science concept called recursion. In short, instead of writing code that tries to guess how many levels of sub directories exist in an image, we will write a function that calls itself every time it finds a sub directory. That function will then call itself for every sub directory it finds, over and over, until it reaches the end of that particular directory path and then unwinds back.
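
If recursion is brand new to you, here is a tiny standalone sketch that has nothing to do with forensic images: it walks a made-up nested dictionary standing in for a directory tree and calls itself whenever it finds a sub directory. The names and structure are purely illustrative.

# Toy illustration of recursion: a dict stands in for a sub directory, None for a file
def walk(tree, parentPath):
  for name, contents in tree.items():
    if isinstance(contents, dict):
      walk(contents, parentPath + [name])  # call ourselves for the sub directory
    else:
      print "/" + "/".join(parentPath + [name])

example = {"Windows": {"System32": {"config": {"SYSTEM": None}}}, "readme.txt": None}
walk(example, [])
# prints /Windows/System32/config/SYSTEM and /readme.txt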

Before we get into the new recursive function let's expand on part 7 by adding some more command line options. The first command line option we will add will allow the user to specify the directory we will start from when we do our hashing. I am not going to repeat what the code means as we covered it in part 7 so here is the new command line argument:
argparser.add_argument(
        '-p', '--path',
        dest='path',
        action="store",
        type=str,
        default='/',
        required=False,
        help='Path to recurse from, defaults to /'
    )

What is new here that I need to explain is passing in a default. If the user does not provide a path to search, then the default value we specify in the default setting will be stored in args.path. Otherwise, if the user does specify a path, that path will be stored in args.path instead.

The next command line option we will give the user is to specify the name of the file we will write our hashes to. It wouldn't be very useful if all of our output went to the command line and vanished once we exited. Most forensic examiners I know eventually take their forensic tool output to programs like Microsoft Word or Excel to get it into something that whoever asked us to do our investigation can read. The code looks like this:
argparser.add_argument(
        '-o', '--output',
        dest='output',
        action="store",
        type=str,
        default='inventory.csv',
        required=False,
        help='File to write the hashes to'
    )

You'll see here again I made a default value in case the user doesn't provide an output file name. So if the user provides a value for output it will become the file name; otherwise our program will create a file named inventory.csv to store the results.
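
As a quick, hypothetical example of how these two options could be combined (the image name and paths here are just placeholders, not files from this series), a run might look like this:

python dfirwizard-v8.py -i YourImage.E01 -p /Users -o userhashes.csv

That would start the recursion at /Users inside the image and write the hashes to userhashes.csv in the current directory.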

Now that we've added additional command line arguments, we need to use them. First we need to assign the directory we are going to start our hashing from to a variable like so:

dirPath = args.path

Next we need to open up our output file name for writing.

outfile = open(args.output,'w')

With this in place we can then change our directory object to use our dirPath variable rather than a hard coded parameter:

directoryObject = filesystemObject.open_dir(path=dirPath)

We will then print out to the command window the directory we are starting with, so that the user knows what is happening:

print "Directory:", dirPath

Last we will call our new function:

directoryRecurse(directoryObject,[])

You can see that we have named our new function directoryRecurse and that it takes two parameters. The first is the directoryObject we made, so that it can start at whatever directory the user specified, or the root directory if none was specified. The second parameter may look odd but it's Python's way of specifying an empty list. We pass in an empty list because when the function calls itself for the directories it finds, it will need a list to keep track of the full path that led to them.

Now let's take a step back to explain the need for this list. A file and a directory have one thing in common as they are stored on the disk: each only knows the parent directory it's in, not any directory above that. What do I mean by that? Let's say we are looking at a directory we commonly deal with like \windows\system32\config. This is where Windows stores the non user registries and we work with it often. Now to us as the user it would seem logical that that is the full path to the file, but as the data is stored on the disk it is a different story. Config knows its parent directory is system32, but only by inode; system32 knows its parent directory is Windows, but only by inode; and Windows knows its parent directory is the root of the drive. What I'm trying to get across here is that we can't look at a directory we run across within a file system and just pull back the full path to it from the file system itself (in NTFS that means the Master File Table or MFT); instead we have to track that ourselves. So the empty list we are passing in at this point will keep track of the directory names that make up the full path to the files and directories we will be hashing.

With that explained let's move on to the real work involved, the directoryRecurse function. It starts with a function definition just like the other functions we have made so far:

def directoryRecurse(directoryObject, parentPath):

You can see that we have called the directoryObject we are passing in directoryObject, and that the empty list is assigned to a variable named parentPath. Next we need to do something with what we passed in by iterating through all the entries in the directoryObject.

for entryObject in directoryObject:
  if entryObject.info.name.name in [".", ".."]:
    continue

You'll notice though that we have a new test contained in an if statement. We are making sure that the directory entry we are looking at in this iteration of the for loop is not '.' or '..'. These are special directory entries that exist to allow us to always be able to refer to the directory itself (.) and the parent directory (..). If we were to let our code keep calling the parent of itself, one of two things could happen: 1. we could enter an infinite loop of going back into the same directory, or 2. we reach a finite amount of recursion before we exit with an error. In the case of Python 2.7 the second answer is the correct one. So if our directory entry has . or .. as a file name we will skip it by using the continue operator. Continue tells Python to skip any code left in this iteration of the loop and move on to the next directory entry to process.

The next thing we are going to do is make use of Python's try statement.

try:
  f_type = entryObject.info.meta.type
except:
  print "Cannot retrieve type of", entryObject.info.name.name
  continue

Try will let us attempt a line of code that we know may return an error that would normally cause our program to terminate. Instead of terminating, our code will call the except routine and perform whatever tasks we specify. In the case of our code above we are attempting to assign the type of directory entry we are dealing with to the variable f_type. If the variable is assigned then we will move on through our code; if it is not then we will print an error letting the user know that we have a directory entry we can't handle and then skip to the next directory entry to process. To read more about try/except go here: https://docs.python.org/2/tutorial/errors.html#handling-exceptions

To see the full list of types that pytsk will return go here: http://www.sleuthkit.org/sleuthkit/docs/api-docs/tsk__fs_8h.html#a354d4c5b122694ff173cb71b98bf637b but what we care about for this program is whether the directory entry is a directory or not. To test this we will use the if conditional operator to see if the contents of f_type are equal to the value we know means a directory. The value that represents that a directory entry is a directory is 0x02, but we can use the constant provided to us by libtsk as TSK_FS_META_TYPE_DIR. The if statement then looks like this:

if f_type == pytsk3.TSK_FS_META_TYPE_DIR:
Before we go farther there is one thing we need to do. We need to create a variable to store the current full path we are working with, in a printable form for output. To do that we will make use of some string formatting you first saw in part 1.

filepath = '/%s/%s' % ('/'.join(parentPath),entryObject.info.name.name)
Here we are assigning to the variable filepath a string that is made up of two values. The first is our parentPath list, which in the first iteration of this function is empty, and the second is the name of the file or directory we are currently processing. We use the join function to place a / between the names of any directories stored in parentPath, the %s string formatting converts whatever is returned into a string, and the whole result is stored in filepath. The end result is the full path to the file as the user expects it.
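
To make that concrete, here is a small standalone example of what that formatting line produces; the directory names are made up for illustration:

parentPath = ['Windows', 'System32']   # built up by the recursion so far
name = 'config'                        # the entry currently being processed
filepath = '/%s/%s' % ('/'.join(parentPath), name)
print filepath                         # prints /Windows/System32/config

Note that on the very first call parentPath is still empty, so the same formatting produces a leading double slash, which is why you will see paths like //$Extend in the output later in this series.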

With that out of the way let's look at what happens if the directory entry we are processing is in fact a directory.

sub_directory = entryObject.as_directory()
parentPath.append(entryObject.info.name.name)
directoryRecurse(sub_directory,parentPath)
parentPath.pop(-1)
print "Directory: %s" % filepath

The first thing we are doing is getting a directory object to work with out of our file object. We do this using the as_directory function, which will return a directory object if the file object it is called on is in fact a directory. If it's not, the function will throw an error and exit the program (unless you do this in a try/except block). We are storing our newly created directory object in a variable named sub_directory.

The next thing we are doing is appending a value to our parentPath list. The append function takes a value, in this case the name of the directory we are processing, and places it at the end of the list. This makes sure the directories listed in our list will always be in the order we explored them.

The next thing we are doing is calling the same function we started with, but now with our sub_directory object as the first parameter and our populated parentPath variable as the second parameter instead of an empty list. From this point forward every sub directory we encounter will be spun off into a new call of directoryRecurse until we've reached the last directory in the chain, and then it will unwind back to the beginning.

When our function returns from checking the directory we need to remove it from the full path list and we can achieve that with the pop function. In this case we are telling the pop function to remove the index value -1 which translates to the last element added on to the list.

Last, for the user, we are printing the full path to the directory we have finished with so they know we are still running.
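
Here is a tiny standalone illustration of that append/recurse/pop pattern, with made-up directory names, showing that parentPath is back to where it started once a sub directory has been fully explored:

parentPath = ['Windows']
parentPath.append('System32')      # descend into the sub directory
# ... directoryRecurse(sub_directory, parentPath) would run here ...
parentPath.pop(-1)                 # climb back out when the call returns
print parentPath                   # prints ['Windows']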

Next we need to write the code for what to do with the files within these directories we are recursing through. This wouldn't be a very useful program if all it did was find directories! To do this we will check f_type again and make sure that the directory entry we are checking actually contains some data to hash:

elif f_type == pytsk3.TSK_FS_META_TYPE_REG and entryObject.info.meta.size != 0:
  filedata = entryObject.read_random(0,entryObject.info.meta.size)
  md5hash = hashlib.md5()
  md5hash.update(filedata)
  sha1hash = hashlib.sha1()
  sha1hash.update(filedata)

Here you can see we are using the constant provided by libtsk, TSK_FS_META_TYPE_REG, which means this is a regular file. Next we are making sure that our file contains some data, that it is not a 0 byte file. If we were to pass a 0 byte file to read_random, as we do in the next line, it would throw an error and exit the program.

After we have verified that we do in fact have a regular file to parse and that it has some data we will open it up and hash it as we saw in part 8. Now, let's do something with this data other than print it to the command line. To do this we are going to bring in a new python library called 'csv' with an import command at the beginning of our program:

import csv

The csv python library is very useful! It takes care of all of the common tasks of reading and writing a comma separated value file for us, including handling the majority of special exceptions when dealing with strange values. We are going to create a csv object that will write out our data in csv form for us using the following statement:

wr = csv.writer(outfile, quoting=csv.QUOTE_ALL)

So wr now contains a csv writer object created with two parameters. The first tells it to write out to the file object outfile which we set above, and the second tells it to place all the values we are writing out into quotes, so that we don't have to worry as much about whatever is contained within a value breaking the csv format.
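
If you want to see what QUOTE_ALL does before wiring it into the full program, a small standalone test like this (with a made-up file name and values) shows the effect:

import csv

testfile = open('example.csv', 'w')
testwriter = csv.writer(testfile, quoting=csv.QUOTE_ALL)
testwriter.writerow([42, '/Windows/System32/config', '2015-03-20 12:00:00'])
testfile.close()
# example.csv now contains: "42","/Windows/System32/config","2015-03-20 12:00:00"

Every field ends up wrapped in quotes, even the numbers, so a comma inside a value won't break the columns.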

Next we need to write out a header for the output file so that the user will understand what columns they are looking at.

outfile.write('"Inode","Full Path","Creation Time","Size","MD5 Hash","SHA1 Hash"\n')

If you modify this program to include additional data in your inventory, make sure to update this header as well!

Next, returning to our function, we need to write out the hash we just generated, along with other helpful information, to our csv output file:

wr.writerow([int(entryObject.info.meta.addr),
  '/'.join(parentPath)+entryObject.info.name.name,
  datetime.datetime.fromtimestamp(entryObject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S'),
  int(entryObject.info.meta.size),
  md5hash.hexdigest(),
  sha1hash.hexdigest()])

You can see here that we've taken all the values we were previously printing out to the command window and placed them in a list, the [ and ] that wrap around all the parameters, which is then passed to writerow. What comes out of this writerow call is a single, fully quoted csv entry for the file we just hashed. We can modify this line to include any other data we want to know about the file we are processing.

Next we need to handle a special condition: 0 byte files. We already know that read_random will fail if we try to read a 0 byte file; luckily we don't have to read a 0 byte file to know its hash value. The hash of a 0 byte file is well known for both MD5 and SHA1, so we can just create a condition to test for them and fill them in!

elif f_type == pytsk3.TSK_FS_META_TYPE_REG and entryObject.info.meta.size == 0:
  wr.writerow([int(entryObject.info.meta.addr),
    '/'.join(parentPath)+entryObject.info.name.name,
    datetime.datetime.fromtimestamp(entryObject.info.meta.crtime).strftime('%Y-%m-%d %H:%M:%S'),
    int(entryObject.info.meta.size),
    "d41d8cd98f00b204e9800998ecf8427e",
    "da39a3ee5e6b4b0d3255bfef95601890afd80709"])

So above we are making sure we are dealing with a regular file that has 0 bytes of data. Once we know that is the case we are calling the same writerow function but now instead of generating hashes we are filling in the already known values for 0 byte files. The MD5 sum of a 0 byte file is d41d8cd98f00b204e9800998ecf8427e and the SHA1 sum of a 0 byte file is da39a3ee5e6b4b0d3255bfef95601890afd80709.
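
If you want to confirm those well known values for yourself, hashing zero bytes of data with hashlib reproduces them:

import hashlib

print hashlib.md5("").hexdigest()   # prints d41d8cd98f00b204e9800998ecf8427e
print hashlib.sha1("").hexdigest()  # prints da39a3ee5e6b4b0d3255bfef95601890afd80709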

The only thing left to do now is to put in the except conditional for this entire section:
except IOError as e:
  print e
  continue

This tells our program that if we get some kind of error while doing any of this, print the error to the command window and move on to the next directory entry.

That's it! We now have a version of DFIR Wizard that will recurse through every directory we specify in an NTFS partition, hash all the files contained within it, and then write that out along with other useful data to a csv file we specify.

You can grab the full code here from our series Github: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v8.py

Pasting in all the code would break the blog formatting so I can't do that this time.

In part 10 of this series we will extend this further to search for files within an image and extract them!

Special thanks to David Nides and Kyle Maxwell who helped me through some late night code testing!


Automating DFIR - How to series on programming libtsk with python Part 10

Hello Reader,
If you just found this series I have good news! There is way more of it to read and you should start at Part 1. See the links to all the parts below:

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image

For those of you who are up to date let's get going! In this part of the series we are going to take our recursive hashing script and make it even more useful. We are going to allow the user to search for the kind of files they want to hash with a regular expression enabled search and give them the option to extract those files as they are found back to their original path under your output directory. Ready? Let's go!

First we will need to import two more libraries, so many libraries! The good news is these are still standard python libraries, so there is nothing new to install. The first library we will bring in is called 'os', which gives us operating system related functions and will map the proper function based on the os you are running on. The second library, called 're', will provide us with regular expression support when evaluating our search criteria. We bring in those libraries as before with an import command:

import os
import re

Now we need to add two more command line arguments to let our user take advantage of the new code we are about to write:

argparser.add_argument(
        '-s', '--search',
        dest='search',
        action="store",
        type=str,
        default='.*',
        required=False,
        help='Specify search parameter e.g. .*lnk'
    )
argparser.add_argument(
        '-e', '--extract',
        dest='extract',
        action="store_true",
        default=False,
        required=False,
        help='Pass this option to extract files found'
    )

Our first new argument is letting the user provide a search parameter. We are storing the search parameter in the variable args.search and if the user does not provide one we default to .* which will match anything.

The second argument is handled differently than the rest of our options. We are setting the variable args.extract as a True or False value with the store_true option under action. If the user provides this argument then the variable will be true and the files matched will be extracted; if the user does not then the value will be false and the program will only hash the files it finds.

It's always good to show the user that the argument we received from them is doing something so let's add two lines to see if we have a search term and print it:

if not args.search == '.*':
  print "Search Term Provided",args.search

 Remember that .* is our default search term, so if the value stored in args.search is anything other than .* our program will print out the search term provided, otherwise it will just move on.

Our next changes all happen within our directoryRecurse function. First we need to capture the full path where the file we are looking at exists. We will do this by combining the partition number and the full path that led to this file, in order to make sure it's unique between partitions.

outputPath = './%s/%s/' % (str(partition.addr),'/'.join(parentPath))
Next we will go into the elif statement we wrote in the prior post to handle regular files with non zero lengths. We will add a new line of code to the beginning of the code block that gets executed, to do our regular expression search, as follows:


elif f_type == pytsk3.TSK_FS_META_TYPE_REG and entryObject.info.meta.size != 0:
  searchResult = re.match(args.search,entryObject.info.name.name)

You can see we are using the re library here and calling the match function it provides. We are providing two arguments to the match function. The first is the search term the user provided us and the second is the file name of the regular, non zero sized file we are inspecting. If the regular expression provided by the user is a match then a match object will be returned and stored in the searchResult variable; if there is not a match then the variable will contain None. We write a conditional to test this result next:


if not searchResult:
  continue
This allows us to skip any file whose name did not match the search term provided. If the user did not specify a search term our default value of .* will kick in and everything will be a match.
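
Keep in mind the search term is a regular expression, not a shell style wildcard, so you would write .*jpg rather than *.jpg. A quick standalone check, using file names from the earlier directory listing, shows how the match test behaves:

import re

print re.match('.*jpg', 'BeardsBeardsBeards.jpg')  # a match object, so the file gets processed
print re.match('.*jpg', 'confession.txt')          # None, so "if not searchResult" skips it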

Our last modification revolves around extracting the files if our user selected the option to. The code looks like this:

if args.extract == True:
  if not os.path.exists(outputPath):
    os.makedirs(outputPath)
  extractFile = open(outputPath+entryObject.info.name.name,'w')
  extractFile.write(filedata)
  extractFile.close()

The first thing we are doing is checking to see if the user has passed the extract flag, causing args.extract to be set to True. If they did then we will extract the file; if not we skip it. If it is true, the first thing we do is make use of the os library's path.exists function. This lets us check whether the output directory we want to create (set in the outputPath variable above) already exists. If it does we can move on; if it does not then we call another os library function named makedirs. makedirs is nice because it will recursively create the path you specify, so you don't have to loop through all the directories in between if they don't exist.

Now that our output path exists it's time to extract the file. We are modifying our old extractFile variable and now appending our outputPath to the file name we want to create. This will place the file we are extracting into the directory we have created. Next we write the data out to it as before and then close the handle, since we will be reusing it.

If I were to run this program against the Level5 image we've been working with, and specify both the extraction flag and a search term of .*jpg, it would look like this:

C:\Users\dave\Desktop>python dfirwizard-v9.py -i SSFCC-Level5.E01 -e -s .*jpg
Search Term Provided .*jpg
0 Primary Table (#0) 0s(0) 1
1 Unallocated 0s(0) 8064
2 NTFS (0x07) 8064s(4128768) 61759616
Directory: /
Directory: /$Extend/$RmMetadata/$Txf
Directory: /$Extend/$RmMetadata/$TxfLog
Directory: /$Extend/$RmMetadata
Directory: //$Extend
match  BeardsBeardsBeards.jpg
match  ILoveBeards.jpg
match  ItsRob.jpg
match  NiceShirtRob.jpg
match  OhRob.jpg
match  OnlyBeardsForMe.jpg
match  RobGoneWild.jpg
match  RobInRed.jpg
match  RobRepresenting.jpg
match  RobToGo.jpg
match  WhatchaWantRob.jpg
Directory: //$OrphanFiles
On my desktop there would be a folder named 2 and underneath that the full path to the file that matched the search term would exist.

That's it! Part 9 had a lot going on but now that we've built the base for our recursion it gets easier from here. The code for this part is located on the series Github here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v9.py

Next in part 11 we will do the same thing from live systems but allow our code to enumerate all the physical disks present rather than hardcoding an option.

Forensic Lunch 3/20/15 - James Carder and Eric Zimmerman

Hello Reader!,
           We had another great Forensic Lunch! This broadcast we had:

James Carder of the Mayo Clinic, @carderjames, talking all about automating your response process to separate the random attacks from the sophisticated attacks. You can hear James talk about this and much more at the SANS DFIR Summit where he'll be a panelist! If you want to work with James, the Mayo Clinic is hiring.

Mayo Clinic Infosec and IR Jobs: http://www.mayo-clinic-jobs.com/go/information-technology-engineering-and-architecture-jobs/255296/?facility=MN
Contact James Carder: carder.james@mayo.edu

Special Agent Eric Zimmerman of the FBI, @EricRZimmerman, talking about his upcoming in depth Shellbags talk at the SANS DFIR Summit as well as his new tool called Registry Explorer. Registry Explorer and Eric's research into Windows registries will be continued in the next broadcast. Whether you are interested in registries from a research, academic or investigative perspective this is a must see, and FREE, tool!

Eric's Blog: http://binaryforay.blogspot.com/
Eric's Github: https://github.com/EricZimmerman
Registry Explorer: http://binaryforay.blogspot.com/p/software.html


You can watch the broadcast here on Youtube: https://www.youtube.com/watch?v=lj7cMHySGSE

Or in the embedded player below:

Automating DFIR - How to series on programming libtsk with python Part 11

Hello Reader,
      I had a bit of a break thanks to a long overdue vacation, but I'm back, and the code I'll be talking about today has been up on the github repository for almost 3 weeks, so if you ever want to get ahead go there, as I write the code before I try to explain it! The Github repository is here: https://github.com/dlcowen/dfirwizard

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image

In this post we are going to augment the script from part 10, which went through an image and searched for and extracted files from all the NTFS partitions in that image, and now we are going to do the same against all the NTFS partitions on a live system. You can obviously tweak this for any other file system, but we will get to that in later posts in this series.

The first thing we need is a way to figure out what partitions exist on a live system in a cross platform way, so our future code can be tweaked to run anywhere. For this I chose the python library psutil, which can provide a wealth of information about the system it's running on, including information about available disks and partitions. You can read all about it here: https://pypi.python.org/pypi/psutil

To bring it into our program we need to call the import function again:

import psutil

and then because we are going to work against a live running system again we need our old buddy admin

import admin

which if you remember from part 5 will auto escalate our script to administrator just in case we forgot to run it as such.

We are going to strip out the functions we needed to find all the parts of a forensic image and replace them with our code to test for administrative access:

if not admin.isUserAdmin():
  admin.runAsAdmin()
  sys.exit()
Next we replace the functions we called to get a partition table from a forensic image with a call to psutil to return a listing of partitions, and iterate through them. The code looks like the following, which I will explain:

partitionList = psutil.disk_partitions()
for partition in partitionList:
  imagehandle = pytsk3.Img_Info('\\\\.\\'+partition.device.strip("\\"))
  if 'NTFS' in partition.fstype:

So here instead of calling pytsk for a partition table we are calling psutil.disk_partitions, which will return a list of partitions that are available to the local system. I much prefer this method to trying to iterate through all volume letters, as we get back just those partitions that are available as well as what file system they are seen as running. Our list of active partitions will be stored in the variable partitionList. Next we iterate through the partitions using the for operator, storing each partition returned into the partition variable. Then we create a pytsk3 Img_Info object for each partition returned, but only continue if psutil recognized the partition as NTFS.
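
If you are curious what psutil actually hands back, a quick standalone loop like this will show you; the exact output of course depends on the machine you run it on:

import psutil

for part in psutil.disk_partitions():
  print part.device, part.mountpoint, part.fstype
# on a Windows system this might print something like: C:\ C:\ NTFS

Each entry is a named tuple, which is why our script can refer to partition.device and partition.fstype by name.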

The next thing we are changing is the try/except block in our recursive directory function. Why? I found in my testing that live systems react much differently than forensic images in setting certain values in libtsk. So rather than using entryObject.info.meta.type to determine if I'm dealing with a regular file, I am using entryObject.info.name.type, which seems to always be set regardless of whether it's a live system or a forensic image. I'm testing to see if I can capture the type of the file and its size here, as there are a lot of interesting special files that only appear at run time that will throw an error if you try to get their size.

try:
  f_type = entryObject.info.name.type
  size = entryObject.info.meta.size
except Exception as error:
  print "Cannot retrieve type or size of",entryObject.info.name.name
  print error.message
  continue

So in the above code I'm getting the type of file (lnk, regular, etc.) and its size, and if I can't I'm handling the error and printing it out before continuing on. You will see errors; live systems are an interesting place to do forensics.

I am now going to make a change I alluded to earlier in the series. We are going to buffer our reads and writes so we don't crash our program by trying to read a massive file into memory. This wasn't a problem in our first examples, as we were working from small test images I made, but now that we are dealing with real systems and real data we need to handle our data with care.

Our code looks as follows:

BUFF_SIZE = 1024 * 1024
offset = 0
md5hash = hashlib.md5()
sha1hash = hashlib.sha1()
if args.extract == True:
  if not os.path.exists(outputPath):
    os.makedirs(outputPath)
  extractFile = open(outputPath+entryObject.info.name.name,'w')
while offset < entryObject.info.meta.size:
  available_to_read = min(BUFF_SIZE, entryObject.info.meta.size - offset)
  filedata = entryObject.read_random(offset,available_to_read)
  md5hash.update(filedata)
  sha1hash.update(filedata)
  offset += len(filedata)
  if args.extract == True:
    extractFile.write(filedata)

if args.extract == True:
  extractFile.close()

First we need to determine how much data we want to read or write at one time from a file. I've copied several other examples I've found and I'm setting that amount to 1 megabyte of data at a time, by setting the variable BUFF_SIZE equal to 1024*1024, or one megabyte. Next we need to keep track of where we are in the file we are dealing with; we do that by creating a new variable called offset and setting it to 0 to start with.

You'll notice that we are creating our hash objects, directories and file handles before we read in any data. That is because we want to do all of these things one time, prior to iterating through the contents of a file. If a file is a gigabyte in size then our while loop will execute 1,024 times, and we still want just one hash and one output file to be created for it.

Next we start a while loop which will continue to execute until our offset is greater than or equal to the size of our file, meaning we've read all the data within it. Now files are not guaranteed to be allocated in 1 meg chunks, so to deal with that we are going to take advantage of a python function called min. Min returns the smaller of the values presented to it, which in our code is the size of the buffer compared to the remaining data left to read (the size of the file minus our current offset). Whichever value is smaller will be stored in the variable available_to_read.

After we know how much data we want to read in this execution of our while loop, we read it as before from our entryObject, passing in the offset to start from and how much data to read, and storing the data read into the variable filedata. We then call the update function provided by our hashing objects. One of the nice things about the hash objects provided by python's hashlib is that if you provide additional data to an already instantiated object it will just continue to build the hash, rather than you having to read all the data in at once.

Next we are incrementing our offset by adding to it the length of data we just read, so we will skip past it on the next execution of the while loop. Finally we write the data out to our output file if we elected to extract the files we are searching for.
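
If the buffering logic is easier to follow without pytsk in the mix, here is the same pattern applied to an ordinary local file; 'somefile.bin' is just a placeholder name:

import os
import hashlib

BUFF_SIZE = 1024 * 1024
md5hash = hashlib.md5()
size = os.path.getsize('somefile.bin')
offset = 0
fh = open('somefile.bin', 'rb')
while offset < size:
  available_to_read = min(BUFF_SIZE, size - offset)  # the last chunk may be smaller than 1 meg
  filedata = fh.read(available_to_read)
  md5hash.update(filedata)                           # builds the same digest as one giant read
  offset += len(filedata)
fh.close()
print md5hash.hexdigest()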

I've added one last bit of code to help me catch any other weirdness that may seep through.

        else:
          print "This went wrong",entryObject.info.name.name,f_type

An else to catch any condition that does not match one of our existing if statements.

That's it! You now have a super DFIR Wizard program that will go through all the active NTFS partitions on a running system and pull out and hash whatever files you want!

You can find the complete code here: https://github.com/dlcowen/dfirwizard/blob/master/dfirwizard-v10.py

In the next post we will talk about parsing partition types other than NTFS and then go into volume shadow copy access!

Forensic Lunch 4/3/15 - Devon Kerr - WMI and DFIR and Automating DFIR

Hello Reader,

We had another great Forensic Lunch!

Guests this week:
Devon Kerr talking about his work at Mandiant/Fireeye and his research into WMI for both IR and attacker usage.


Matthew and I going into the Automating DFIR series and our upcoming talk at CEIC
You can watch the show on Youtube:  https://www.youtube.com/watch?v=y-xtRkwaP2g

or below!


National CCDC 2015 Red Team Debrief

Automating DFIR - How to series on programming libtsk with python Part 12

Hello Reader,
      How has a month passed since the last entry? To those reading this, I do intend to continue this series for the foreseeable future, so don't give up! In this part we will talk about accessing non NTFS partitions and the file system support available to you in pytsk.

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system 

In this series so far we've focused on NTFS because that's where most of us spend our time investigating. However, the world is not Windows alone, and luckily for us the sleuthkit library that pytsk binds to is way ahead of us. Here is the full list of file systems that the sleuthkit library supports:

  • NTFS 
  • FAT12 
  • FAT16 
  • FAT32 
  • exFAT
  • UFS1 (FreeBSD, OpenBSD, BSDI ...)
  • UFS1b (Solaris - has no type)
  • UFS2 - FreeBSD, NetBSD.
  • Ext2 
  • Ext3 
  • SWAP 
  • RAW 
  • ISO9660 
  • HFS 
  • Ext4 
  • YAFFS2 


Now that is what the current version of the sleuthkit supports; pytsk3, however, is compiled against an older version. I've tested the following file systems to work with pytsk3:
  • NTFS
  • FAT12
  • FAT16
  • FAT32
  • EXT2
  • EXT3
  • EXT4
  • HFS
Based on my testing I know that exFAT does not appear to be supported in this binding, and from the testing of reader Hans-Peter Merkel I know that YAFFS2 is also not currently supported. I'm sure in the future when the binding is updated these file systems will come into scope.

So now that we know what we can expect to work, let's change our code from part 10 to work against any supported file system, and allow it to work with multiple image types to boot.

What you will need:
I'm using a couple of sample images from the CFReDS project for this part as they have a different partition for different variants of the same file system type. For instance the ext sample image we will use has ext2, ext3, and ext4 partitions all on one small image. Pretty handy!

The first thing we will need to change is how we are opening our images to support multiple image types. We are going to be working with raw and e01 images all the time and keeping two separate programs to work with each seems dumb. Let's change our code to allow us to specify which kind of image we are working with and in the future we may automate that as well!

We need to add a new required command line option where we specify the type of image we are going to be working with:

argparser.add_argument(
        '-t', '--type',
        dest='imagetype',
        action="store",
        type=str,
        default=False,
        required=True,
        help='Specify image type e01 or raw'
    )


We are defining a new flag (-t or --type) to pass in the type of image we are dealing with. We are then storing our input into the variable imagetype and making this option required.

Now we need to test our input and call the proper Image Info class to deal with it. First let's deal with the e01 format. We are going to move all the pyewf specific code into this if block:

if (args.imagetype == "e01"):
  filenames = pyewf.glob(args.imagefile)
  ewf_handle = pyewf.handle()
  ewf_handle.open(filenames)
  imagehandle = ewf_Img_Info(ewf_handle)

Next we are going to define the code to work with raw images:

elif (args.imagetype == "raw"):
  print "Raw Type"
  imagehandle = pytsk3.Img_Info(url=args.imagefile)


One last big change to make and all the rest of our code will work:
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len
  try:
    filesystemObject = pytsk3.FS_Info(imagehandle, offset=(partition.start*512))
  except:
    print "Partition has no supported file system"
    continue
  print "File System Type Detected ",filesystemObject.info.ftype

We are moving our FS_Info call to open the file system into a try/except block so that if the partition type is not supported our program won't exit on error. Why do we have to test each partition? Because we can't trust the partition description to always tell us what file system is in it (Windows 8, for instance, changed them), and the only method tsk makes available to us to determine the file system is within the FS_Info class. So we will see if pytsk supports opening each partition we find, and if it does not we will print an error to the user. If we do support the file system type then we will print the type detected and the rest of our code will work with no changes needed!

Let's see what this looks like on each of the example images I linked at the beginning of the post.

FAT:
E:\development>python dfirwizard-v11.py -i dfr-01-fat.dd -t raw
Raw Type
0 Primary Table (#0) 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 128
Partition has no supported file system
2 DOS FAT12 (0x01) 128s(65536) 16384
File System Type Detected  TSK_FS_TYPE_FAT16
Directory: /
3 DOS FAT16 (0x06) 16512s(8454144) 65536
File System Type Detected  TSK_FS_TYPE_FAT16
Directory: /
4 Win95 FAT32 (0x0b) 82048s(42008576) 131072
File System Type Detected  TSK_FS_TYPE_FAT32
Directory: /
5 Unallocated 213120s(109117440) 1884033
Partition has no supported file system

Ext:
E:\development>python dfirwizard-v11.py -i dfr-01-ext.dd -t raw
Raw Type
0 Primary Table (#0) 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 61
Partition has no supported file system
2 Linux (0x83) 61s(31232) 651175
File System Type Detected  TSK_FS_TYPE_EXT2
Directory: /
Cannot retrieve type of Bellatrix.txt
3 Linux (0x83) 651236s(333432832) 651236
File System Type Detected  TSK_FS_TYPE_EXT3
Directory: /
4 Linux (0x83) 1302472s(666865664) 651236
File System Type Detected  TSK_FS_TYPE_EXT4
Directory: /
5 Unallocated 1953708s(1000298496) 143445
Partition has no supported file system

HFS:
E:\development>python dfirwizard-v11.py -i dfr-01-osx.dd -t raw
Raw Type
0 Safety Table 0s(0) 1
Partition has no supported file system
1 Unallocated 0s(0) 40
Partition has no supported file system
2 GPT Header 1s(512) 1
Partition has no supported file system
3 Partition Table 2s(1024) 32
Partition has no supported file system
4 osx 40s(20480) 524360
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
5 osxj 524400s(268492800) 524288
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
6 osxcj 1048688s(536928256) 524288
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
7 osxc 1572976s(805363712) 524144
File System Type Detected  TSK_FS_TYPE_HFS_DETECT
Directory: /
8 Unallocated 2097120s(1073725440) 34
Partition has no supported file system

There you go! We can now search, hash and extract from most things that come our way. In the next post we are going back to Windows and dealing with Volume Shadow Copies!

Follow the github repo here: https://github.com/dlcowen/dfirwizard



Automating DFIR - How to series on programming libtsk with python Part 13

Hello Reader,
          This is part 13 of a series with many more parts planned. I don't want to commit myself to listing all the planned parts, as I feel that curses me to never finish. We've come a long way since the beginning of the series, and in this part we solve one of the most persistent issues most of us have: getting easy access to volume shadow copies.

Now before we continue a reminder, don't start on this post! We've come a long way to get to this point and you should start at part 1 if you haven't already!

Part 1 - Accessing an image and printing the partition table
Part 2 - Extracting a file from an image
Part 3  - Extracting a file from a live system
Part 4 - Turning a python script into a windows executable
Part 5 - Auto escalating your python script to administrator
Part 6 - Accessing an E01 image and extracting files
Part 7 - Taking in command line options with argparse to specify an image
Part 8 - Hashing a file stored in a forensic image
Part 9 - Recursively hashing all the files in an image
Part 10 - Recursively searching for files and extracting them from an image
Part 11 - Recursively searching for files and extracting them from a live system
Part 12 - Recursing through multiple file system types

What you will need for this part:

You will need to download and install the pyvshadow MSI found here for Windows. This is the Python library that provides us with volume shadow access. It binds to the libvshadow library:
https://github.com/log2timeline/l2tbinaries/blob/master/win32/pyvshadow-20150106.1.win32-py2.7.msi

The vss helper library I found on the plaso project website:
https://github.com/dlcowen/dfirwizard/blob/master/vss.py

You will need to download the following sample image that has volume shadows on it:
https://mega.nz/#!LlRFjbZJ!s0k263ZqKSw_TBz_xOl1m11cs2RhIIDPoZUFt5FuBgc
Note this is a 4GB rar file that when uncompressed will become a 25GB raw image.

The Concept:

Volume shadow copies are different than any other file system we've dealt with thus far. The volume shadow subsystem wraps around an NTFS partition (I'm not sure how it will be implemented in ReFS) and stores within itself a differential, cluster based backup of changes that occur on the disk. This means that all the views our tools give us of the data contained within the volume shadow copies are emulated file system views, based on the database of differential cluster maps stored within the volume shadow system. This is what the libvshadow library does for us: it parses the differential database and creates a view of the data contained within it as a file system that can be parsed.

As with all things technical there is a catch. The base libvshadow was made to work with raw images only. There is of course a way around this, already coded into dfvfs, that I am testing for another part so we can access shadow copies from other types of forensic images.


As per Joachim Metz:
1. VSC is a large subsystem in Windows that can even store file copies on servers
2. VSS (volsnap on-disk storage) is not a file system but a volume system; it lies below NTFS. I opt to read: https://googledrive.com/host/0B3fBvzttpiiSZDZXRFVMdnZCeHc/Paper%20-%20Windowless%20Shadow%20Snapshots.pdf

Which means I am grossly oversimplifying things. In short I am describing how a volume system that exists within an NTFS volume is being interpreted as a file system for you by our forensic tools. Do not confuse VSS for actual complete volumes; they are differential cluster based records.

The Code:

The first thing we have to do is import the new library you just downloaded and installed as well as the helper class found on the Github.


import vss
import pyvshadow

Next, since our code is NTFS specific, we need to extend our last multi filesystem script to do something special if it detects that it's accessing an NTFS partition:

print "File System Type Detected .",filesystemObject.info.ftype,"."
if (str(filesystemObject.info.ftype) == "TSK_FS_TYPE_NTFS_DETECT"):
  print "NTFS DETECTED"

We do this as seen above by comparing the string "TSK_FS_TYPE_NTFS_DETECT" to the converted enumerated value contained in filesystemObject.info.ftype. If the values match then we are dealing with an NTFS partition that we can test to see if it has volume shadow copies.

To do the test we are going to start using our new libraries:

volume = pyvshadow.volume()
offset = (partition.start*512)
fh = vss.VShadowVolume(args.imagefile, offset)
count = vss.GetVssStoreCount(args.imagefile, offset)

First we create a pyvshadow object by calling the volume constructor and storing the object in the variable named volume. Next we declare the offset to the beginning of our detected NTFS partition again. Then we use our vss helper library for two different purposes. The first is to get a volume object that we can work with to get access to the volume shadows stored within. When we call the VShadowVolume function in our helper vss class we pass it two arguments: the name of the raw image we passed in to the program and the offset to the beginning of the NTFS partition stored within the image.

The second vss helper function we call is GetVssStoreCount, which takes the same two arguments but does something very different. As the name implies, it returns the number of shadow copies that are present on the NTFS partition. The returned value starts at a count of 1 but the actual index of shadow copies starts at 0, meaning whatever value is returned here we will have to treat as count - 1 in our code. The other thing to know, based on my testing, is that you may get more volume shadows returned than are available on the native system. This is because when libvshadow parses the database it shows all available instances, including those volume shadow copies that have been deleted by the user or system but still partially exist within the database.

                                Not all NTFS partitions contain volume shadow copies so we need to test the count variable to see if it contains a result:

                                    if (count):
                                      vstore=0
                                      volume.open_file_object(fh)

If it does contain a result (the function will set the count variable to undefined if there are no volume shadows on the partition) then we need to do two things. The first is to keep track of which volume shadow we are working with, which always begins at 0. The second is to take our pyvshadow volume object and open it against the fh volume object we created in the code above.

                                Once we have the pyvshadow volume object working with our vss helper class fh volume object we can start getting to the good stuff:

                                      while (vstore < count):
                                        store = volume.get_store(vstore)
                                        img = vss.VShadowImgInfo(store)

We are using a while loop here to iterate through all of the available shadow copies using the count - 1 logic we discussed before (starting from 0 there are count - 1 copies to go through). Next we get the pyvshadow volume object to return a view of the volume shadow copy we are working with (whatever value vstore is currently set to) from the get_store function and store it in the 'store' variable.

We will then take the object stored in the 'store' variable and call the vss helper class function VShadowImgInfo, which will return an Img_Info object that we can pass into pytsk and that will work with our existing code:

                                        vssfilesystemObject = pytsk3.FS_Info(img)
                                        vssdirectoryObject = vssfilesystemObject.open_dir(path=dirPath)
                                        print "Directory:","vss",str(vstore),dirPath
                                        directoryRecurse(vssdirectoryObject,['vss',str(vstore)])
                                        vstore = vstore + 1

So we are now working with our volume shadow copy as we would any other pytsk object. We change what we print in the directory line to include which shadow copy we are working with. We also change the parentPath variable we pass into the directoryRecurse function from [] (the empty list we used in our prior examples) to a list with two members. The first member is the string 'vss' and the second is the string version of which shadow copy we are currently about to iterate and search. This is important so that we can uniquely export each file that matches the search expression without overwriting the last file exported from the prior shadow copy.

                                Lastly we need to increment the vstore value for the next round of the while loop.

                                Only one more thing to do. We need to search the actual file system on this NTFS partition and define an else condition to continue to handle all the other file systems that our libtsk version supports:

                                      #Capture the live volume
                                      directoryObject = filesystemObject.open_dir(path=dirPath)
                                      print "Directory:",dirPath
                                      directoryRecurse(directoryObject,[])
                                  else:
                                      directoryObject = filesystemObject.open_dir(path=dirPath)
                                      print "Directory:",dirPath
                                      directoryRecurse(directoryObject,[])

                                We don't have to change any other code! That's it! We now have a program that will search, hash and export like before but now has volume shadow access without any other programs, procedures or drivers.
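To see how the pieces fit together, here is a condensed sketch of the NTFS branch we just built. This is only a rough assembly of the snippets above; it assumes the filesystemObject, partition, args, dirPath and directoryRecurse pieces from the earlier parts of this series, and the full working script is dfirwizard-v12.py in the GitHub repo:

if (str(filesystemObject.info.ftype) == "TSK_FS_TYPE_NTFS_DETECT"):
  print "NTFS DETECTED"
  volume = pyvshadow.volume()
  offset=(partition.start*512)
  fh = vss.VShadowVolume(args.imagefile, offset)
  count = vss.GetVssStoreCount(args.imagefile, offset)
  if (count):
    vstore=0
    volume.open_file_object(fh)
    while (vstore < count):
      #Treat each shadow copy as its own file system and recurse it
      store = volume.get_store(vstore)
      img = vss.VShadowImgInfo(store)
      vssfilesystemObject = pytsk3.FS_Info(img)
      vssdirectoryObject = vssfilesystemObject.open_dir(path=dirPath)
      print "Directory:","vss",str(vstore),dirPath
      directoryRecurse(vssdirectoryObject,['vss',str(vstore)])
      vstore = vstore + 1
  #Capture the live volume
  directoryObject = filesystemObject.open_dir(path=dirPath)
  print "Directory:",dirPath
  directoryRecurse(directoryObject,[])
else:
  directoryObject = filesystemObject.open_dir(path=dirPath)
  print "Directory:",dirPath
  directoryRecurse(directoryObject,[])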

                                Running the program against the sample image looks like this:
E:\development>python dfirwizard-v12.py -i image.dd -t raw -o rtf.cev -e -s .*rtf
                                Search Term Provided .*rtf
                                Raw Type
                                0 Primary Table (#0) 0s(0) 1
                                Partition has no supported file system
                                1 Unallocated 0s(0) 2048
                                Partition has no supported file system
                                2 NTFS (0x07) 2048s(1048576) 204800
                                File System Type Dectected . TSK_FS_TYPE_NTFS_DETECT .
                                NTFS DETECTED
WARNING:root:Error while trying to read VSS information: pyvshadow_volume_open_file_object: unable to open volume. libvshadow_io_handle_read_volume_header: invalid volume identifier.
                                3 NTFS (0x07) 206848s(105906176) 52219904
                                File System Type Dectected . TSK_FS_TYPE_NTFS_DETECT .
                                NTFS DETECTED
                                Directory: vss 0 /
                                Directory: vss 1 /
                                Directory: vss 2 /
                                Directory: /
                                4 Unallocated 52426752s(26842497024) 2048
                                Partition has no supported file system

                                The resulting exported data directory structure looks like this: 
                                You can access the Github of all the code here: https://github.com/dlcowen/dfirwizard

                                and the code needed for this post here:

Next part let's try to do this on a live system!


                                Wait it's October?

                                Hello Reader,
                                       It's been 5 months since I last updated the blog and I have so much to share with you! I'm writing this to force myself to follow up with some posts tomorrow that will include:

                                Our slides from OSDFCon
                                An overview of our new open source project we put on GitHub ElasticHandler
                                An overview of our other new open source project on GitHub GCLinkParser
All the videos of the forensic lunch, though they've been on YouTube all along!
                                An update on the forensic lunch podcast version
                                and more!

                                Talk to you tomorrow!

                                Presenting ElasticHandler - OSDFCon 2015 - Part 1 of 3

                                Hello Reader,
It's been too long; there are times I miss the daily blogging, when I made myself do this every day. Other times I'm just ready to go to sleep and happy to not have the obligation on me. But 5 months without sharing what we are doing is far too long.

Today let me introduce you to what I hope will be a new tool in your arsenal: we call it Elastic Handler. Why? Well, it's well known that we are bad at naming things at G-C Partners. We are bad at making cool names involving animals or clever acronyms. To make up for this we work really hard to make useful things. I think you'll be pleasantly surprised to see how much this could help your day to day DFIR life.

                                What you will need:

                                1. Python installed, hopefully if you followed the blog at all this year you have that now
                                2. Our code from Github: https://github.com/devgc/ElasticHandler
                                3.  Elastic Search.

If you don't already have Elastic Search (ES) running or you don't want to dedicate the time to do so, then you and I share a lot in common! When I was first testing this with Matt I didn't want to use the server we had set up internally, as that wouldn't be realistic for those just trying this for the first time. That's when I remembered this amazing website that has packaged up services like ES and turned them into pre-configured and deployable instances. This website is called BitNami.

                                So let's say after reading this post you realize you want to use this neat new tool we've put out (it's open source btw!). Then you can just run the ES installer from Bitnami found here: https://bitnami.com/stack/elasticsearch/installer and available for Windows, OSX and Linux!

                                Once you run through their install wizard everything is configured and ready to go! It really is that simple and it just keeps taking me by surprise how easy it is each time I use one of their packages.

                                What does it do?

You have what you need to use our new tool, but what does it do? Elastic Handler allows you to take all the tools you currently run and make their output better by doing the following:

1. It allows you to define a mapping from the CSV/TSV/etc. output you are getting now into JSON so that ES can ingest it
2. It allows you to normalize all of the data you are feeding in so that your column names are suddenly the same, allowing cross-report searching and correlation
3. It lets you harness the extreme speed of Elastic Search to do the heavy lifting in your correlation
4. It lets you take a unified view of your data/reports to automate that analysis and create the spreadsheets/etc. that you would have spent a day on previously

The idea here is to let computers do what can be automated so you can spend more time using what makes humans special. What I mean by that is your analyst brain and experience to spot patterns, trends and meaning in the output of the reports.

In the GitHub repository there are several mappings provided for the tools we call out to the most, like TZWorks tools, ShellBags Explorer, our link parser, Mandiant's ShimCache parser, etc. But the cool thing about this framework is that to bring in another report all you have to do is generate two text files to define your mapping; there is no code involved.

                                How does it work?


                                To run ElasticHandler you have two choices.

                                Option 1 - One report at a time
                                "elastichandler.py --host --index --config --report

                                Option 2 - Batching reports
                                Included in the github repository is a batchfile that just shows how to run this against multiple reports at one time. The batch file is called 'index_reports.bat' you just need to add a line for each report you want to bring in.

                                Let's go into detail on what each option does.

                                --host: This is the IP Address/Hostname of the ES server. If you installed it on your own system you can just reference 127.0.0.1 or localhost

--index This is the name of the index in Elastic Search where all of this batch of reports' data will be stored. If this index does not exist yet, that's ok! ES will just make it for you and store the data there as long as that index is specified.

Think of indexes as cases in FTK or EnCase. All of the reports related to one particular investigation should go into one 'index' or 'case'. So if you had reports for one person's image you could make an index called 'bob', and if you later bring in a new case involving Max you would then specify 'max' as the index for all your new reports.

Indexes here are just how you logically store and group together the reports in your ES instance. In the end they are all stored on the ES server and you can always specify which one to insert data into or query data from.

--config This is where the magic happens. The config file defines how the columns in your report output will be mapped into JSON data for Elastic Search to store. Elastic Search's one big requirement is that all data to be ingested has to be in JSON format. The good news is that all you have to do is create a text file that defines these columns and ES will take it without you having to pre-create a table or file within it to properly store the data. ES will store just about anything; it's up to you to let it know what it's dealing with so you can search it properly.
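To make that concrete, here is roughly what one mapped report row looks like once it has been turned into a JSON document for ES. The field names and values below are made up purely for illustration; your mapping defines the real ones:

{
  "computer_name": "WORKSTATION01",
  "artifact": "shimcache",
  "file_path": "C:\\Windows\\System32\\cmd.exe",
  "last_modified": "2015-10-01T13:37:00"
}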

This leads to the mapping file, which the config file references, that tells ES how to treat each column. Is it text, a date, etc.? I'll write up more on how to create these config files in the next post.

--report This is the tool output you want to bring into ES for future correlation and searching. Remember that you have to have a mapping defined in the config file above for the tool before you can successfully turn that CSV/TSV/etc. into something ES can take.

                                If you just want to test out ES Handler we have sample reports ready to parse and config files for them ready to go in the GitHub.
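Putting the options together, a hypothetical run against a local ES install might look like the line below. The index name and file paths here are made up; substitute the sample report and config file names from the GitHub repo:

python elastichandler.py --host 127.0.0.1 --index bob --config config\shimcache.config --report reports\shimcache_output.csv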

                                How does it make my DFIR life better?


Once you have data ingested you can view it through ES's web interface (not the greatest) or Kibana (a visualization tool, good for timeline filtering). If you really want to take full advantage of what you've just done (normalize, index and store) you need some program logic to correlate all that data into something new!

I've written one such automagic correlation for mapping out which USB drives have been plugged in and what's on each of the volumes plugged in, via lnk files and jump lists, into a xlsx file. You can find the script in our GitHub repo under scripts; it's called ExternalDeviceExample.py. I've fully commented this code so you can follow along and read it, but I'll also be writing a post explaining how to create your own.

                                I do plan to write more of these correlation examples but let me show what the output looks like based on the sample data we provide in the GitHub repo:

The first tab shows the USB devices found in the image from the TZWorks USP tool:


                                The other tabs then contain what we know is on each volume based on the volume serial number recorded within the lnk files and jump lists:


And all of this takes about 1 second to run and create an xlsx file with freeze panes applied to the first two rows for scrolling, filters applied to all the columns for easy viewing and more!

You can make the report output fancier, and if you wanted to do so I would recommend looking at Ryan Benson's code for writing out xlsx files using xlsxwriter found in Hindsight: https://github.com/obsidianforensics/hindsight/blob/master/hindsight.py which is where I found the best example of doing it.
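If you want a feel for what one of these correlation scripts boils down to before reading ExternalDeviceExample.py, here is a minimal sketch. This is not the actual script; the index name 'bob', the output file name and the assumption that the elasticsearch and xlsxwriter Python modules are installed are all mine:

from elasticsearch import Elasticsearch
import xlsxwriter

# Connect to the local ES install and pull back documents from one index.
es = Elasticsearch(['127.0.0.1'])
results = es.search(index='bob', body={'query': {'match_all': {}}}, size=1000)
hits = results['hits']['hits']

# Dump the returned documents into a spreadsheet, one row per record.
workbook = xlsxwriter.Workbook('correlation_example.xlsx')
worksheet = workbook.add_worksheet('Results')

if hits:
  # Use the field names of the first document as the column headers.
  fields = sorted(hits[0]['_source'].keys())
  for col, field in enumerate(fields):
    worksheet.write(0, col, field)
  for row, hit in enumerate(hits, start=1):
    for col, field in enumerate(fields):
      worksheet.write(row, col, str(hit['_source'].get(field, '')))

workbook.close()

A real correlation script would replace the match_all query with targeted queries, for example matching volume serial numbers from the USP output against the ones recorded in the lnk file and jump list reports.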

                                What's next?

                                Try out Elastic Handler, see if you like what it does and look to see what tools we are missing mappings for. You can ingest anything you can get a report output for and you can correlate anything you can find within it. If you want more background you can see our slides from OSDFCon here:
                                https://www.dropbox.com/s/egz8pwn38ytcf28/OSDFCon2015%281%29.pptx?dl=0

                                and watch the blog as I supplement this with two more posts regarding how to make your own mapping files and how to create your own correlation scripts. 


                                Forensic Lunch Podcast Version

                                Hello Reader,
                                          Did you know we have a podcast version of the forensic lunch?

Well we do!

                                You can directly subscribe to it here: http://forensiclunch.libsyn.com/

                                or you can get on iTunes https://itunes.apple.com/us/podcast/forensic-lunch-david-cowen/id1035370378?mt=2, Doubletwist and other fine podcast directories.

                                All the past and current episodes are up there and I try to get new episodes converted from Youtube videos to MP3 within a couple days of the broadcast or sooner.

                                If you want to watch the videocast (I think that's the right word) they are all on Youtube here: https://www.youtube.com/user/LearnForensics

                                Lastly for this post the next episode of the Forensic Lunch is 11/13/15 with:


• Andrew Case, @attrc, from the Volatility Project talking about Volatility 2.5, new plugins and the winners of this year's Volatility Plugin Contest
• Yogesh Khatri, from Champlain, talking about SRUM forensics in Windows 8.1+. A truly amazing new artifact
• Matt and I talking about our new open source tool Elastic Handler


                                You can RSVP to watch it live here: https://plus.google.com/b/105962155502598586194/events/cn8j9sfiojcbjo03uju2ovqruo0

                                You can watch it on Youtube here: https://www.youtube.com/watch?v=5qUZqjltHmU

                                Expect more posts in this space through the end of the year!

                                How to install dfVFS on Windows without compiling

                                Hello Reader,
After some testing, James Alwood in our office was able to get dfVFS installed using just MSI packages from the log2timeline project. This is a big step for us as it severely lowers the bar for who can take advantage of all the advanced capabilities that dfVFS provides versus modules like pytsk on their own. In this first post in the dfVFS series James has written a walkthrough on how to install it and make sure it works.

Digital Forensics Virtual File System (dfVFS) is a flexible virtual file system framework that allows its users to access many different kinds of image formats using Python.  Getting it up and running can be difficult and confusing.  This guide will help you install everything you need to get started using dfVFS on Windows.  I am assuming you already have 32 bit Python 2.7.9 or .10 installed.

                                Dependencies


dfVFS has a lot of dependencies; fortunately most of them are packaged together and made available by the log2timeline project.

1) Go to the log2timeline binaries (l2tbinaries) repository on GitHub
2) Click on the “Download ZIP” button on the right side of the screen
                                3) Unzip the file
                                4) Go to the win32 folder
                                5) Install the msi’s for the following packages into your Python installation (name is before version number):
                six
                construct
                protobuf
                pytsk3
                pybde
                pyewf
                pyqcow
                pysigscan
                pysmdev
                pysmraw
                pyvhdi
                pyvmdk
                pyvshadow

                                6) Open a Python session in IDLE or from a command prompt
                                7) Enter
                                      >>>import sqlite3
                                      >>>sqlite3.sqlite_version
                                8) If the output is below version 3.7.8 (fresh Python installs seem to have 3.6.21) go to https://www.sqlite.org/download.html
                                9) Download sqlite-dll-win32-x86-3081002.zip and unzip the file
                                10) In C:\Python27\DLLs back up the sqlite3.dll and then copy the new sqlite3.dll you just downloaded to this folder
                                11) If you haven’t already, end your Python session then restart it.  Run the lines from Step 7 again.  You should now be on a more current version of sqlite3
                                12) All of your dependencies should now be installed

                                Installing dfVFS


                                With all the dependencies installed we can now get the dfVFS module installed.

                                1) Go back into the win32 folder in the log2timelines archive from step 4 in part 1
                                2) Run the dfvfs msi located inside the win32 folder
                                3) When it is complete dfVFS will be added to your Python installation and ready for use
                                4) If you would like to test the installation to verify it was successful download the dfVFS zip available on Github: https://github.com/log2timeline/dfvfs
5) Unzip the archive.  DO NOT use the extract option included in Windows; it makes minor modifications to some of the test files while unpacking them and will cause tests to fail.
                                6) Open a command prompt and cd into the directory where you unpacked the dfVFS archive
                                7) Run python run_tests.py from this directory to verify that dfVFS was properly installed

                                8) If you rename or delete the dfvfs folder in your unzipped dfVFS archive and rerun run_tests.py the tester will run dfVFS from the installed package in your Python directory
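One last optional sanity check (my own habit, not part of the official test suite): from a Python prompt started outside of the unpacked source directory, confirm that the import resolves to your Python site-packages folder rather than the source tree:

      >>>import dfvfs
      >>>dfvfs.__file__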

                                Tool Testing: File Carvers as seen on the Forensic Lunch 3/11/16

                                Greetings Readers,

This post documents the findings we discussed on the Forensic Lunch 3/11/16 episode about how different file carvers work (watch it here). I asked James to write up the results of our test and provide links to our test images.

On Friday’s (3/11/16) Forensic Lunch we discussed the results of testing several different file carvers. Using a copy of a Windows 7 VHD we keep on hand for testing purposes, we decided to find out how well each of the tools recovered some known data. 10 PDFs, 8 Word docs, 10 prefetch files, 7 SQLite DBs and 10 event logs were copied to the My Documents directory in the VM and then shift-deleted in groups of 5.  In addition, Office was installed and a Word document was created and then 10 versions made, bringing the total of Word documents deleted to 18.  Since some programs that do file carving are file system aware, a copy of the VHD was made where the MFT was overwritten with nulls using a hex editor.  The carvers tested were X-Ways, Blade, Bulk Extractor, and Blacklight. (Note: if you would like to see a different carver tested, put it in the comments below)

X-Ways and Blacklight are both tools that are file system aware, meaning that when an image was carved these tools would traverse the MFT and display deleted files they could identify in the directory they had been deleted from, if they were still being referenced and their contents were not overwritten.  This also means that the tools would not include these recovered files within their ranges of unallocated space to be carved, as they know the contents of those files are still valid.

In testing, Blacklight carved and validated more PDFs and docx files, while X-Ways provided support for carving prefetch files and event logs where Blacklight currently does not have an option for these file types.  The version of the same image with no MFT provided similar numbers of documents carved for X-Ways and Blacklight, with Blacklight giving fewer false positives on xlsx and pptx files as they have added a validation step on carved data.

Blade and Bulk Extractor do not check for a file system and just start scanning through the data.  Of all the carvers, Blade by far carved the most documents.  By default Blade does not have definitions for prefetch files or event logs.  Additionally, Blade does not identify what type of Office document it has found and makes a copy of what it finds as a .pptx, a .docx, and an .xlsx.  This may just be a bug in the version of Blade we are using; we need to try the newly released version.

Bulk Extractor does not look for 2007 or greater Office documents (docx, pptx, etc.) specifically; they are included within the option for zip files.  On this test set it found 24,000 zip archives.  The Word documents we deleted would be in there, but it would require a second step to locate them, possibly with a tool like exiftool to identify those zip files with Office metadata. Bulk Extractor did not find any PDFs from its pdf signature but was able to find the same number of prefetch files as X-Ways.  It also found and carved a similar number of SQLite DBs as X-Ways and Blacklight did when they were given the image with a destroyed MFT.

We also ran the carvers against the CFReDS L0 Documents test set.  This test set had 7 documents.  X-Ways and Blacklight had no trouble and got all 7 files correct.  Blade got 6 of the 7, missing one docx in the docx folder.  Bulk Extractor put what it found in the zip file listing and missed the PDFs.

In terms of speed Bulk Extractor was the fastest on our test set, finishing its run in 2 minutes and 1 second against a 50GB image.  Next was X-Ways with and without the MFT at 2:13 and 2:25.  Blacklight with the MFT ran in 8:02.  Blade ran in 9:10 and finally Blacklight without an MFT took 16:54.  Blacklight could not use the 12.5 GB compressed version of the 50GB VHD so we converted it into a full 50GB raw image.  The CFReDS set is so small all the carvers got through it in a few seconds.

Overall X-Ways out of the box supported all the file types we were interested in.  For situations where the default options are not enough, Blade tends to be the most flexible, allowing you to write detailed new definitions.  Bulk Extractor's filters were not as well suited for this test, but it provides options that are of use in IR scenarios and it is the only one of the four available for free.  Blacklight gave fewer false positives in the no-MFT set and was the only tool we tested that will attempt to validate the results of the carving.

Want to do your own testing? Want to verify our results? You can! Click here for the test set with an MFT and result spreadsheet and here for the test set where the MFT has been nuked. The CFReDS images can be found here: http://www.cfreds.nist.gov/FileCarving/index.html

                                Daily Blog #366: The return to Daily Blogging and pytsk vs dfvfs

                                Hello Reader,
                                               As crazy as it sounds, I've missed doing daily blogs. It forced me to keep looking, reading and thinking about new things to write about and do. The forensic lunch podcast is still going strong and is not going away but that is more me leaning on others in the community to talk about what they are doing and less about forcing myself to document and share my own research. 

                                So with that in mind, let's set our schedule for this blog.

                                Sunday - Sunday Funday's return, prepare yourself for more forensic fun and real prizes
                                Monday - Sunday Funday results
                                Tuesday - Daily Blog entry
                                Wednesday - Daily Blog entry
                                Thursday - Daily Blog entry
                                Friday - Either Forensic Lunch or a video tutorial depending on the broadcast schedule
                                Saturday - Saturday reading will return

                                This year you can expect more blogs about new artifacts, old artifacts, triforce, journal forensics, python programming for DFIR and more. 

                                If you want to show your support for my efforts, there is an easy way to do that. 

                                Vote for me for Digital Forensic Investigator of the Year here: https://forensic4cast.com/forensic-4cast-awards/

                                Otherwise, get involved! Leave comments, tell your friends about the blog/podcast, send me a tweet, drop me an email (dcowen@g-cpartners.com) it's always more fun when we all talk and work together. Windows 10 is out, OSX keeps getting updated with new features, Ubuntu is running on Windows, iOS and Android keep getting more interesting and so much more is out there to be researched! 

                                So with it being Wednesday let's get into our first topic which leads into my next planned blog posts. 


                                PYTSK v DFVFS


If you read the blog last year you would have seen a series of posts under the series title Automating DFIR. If you noticed, I stopped after part 13 and haven't continued the series since. There is a reason for this, and the reason is not that I got tired of writing about it. Instead I hit the wall that required us to use dfVFS in Triforce: shadow copies and E01s. Libvshadow is an amazing library but as a standalone library it requires a raw disk image or a live disk; it does not support other forensic image formats directly.

I looked into ways around this by reading the Plaso code and seeing what glue they were using to shape the object and the super classes in such a way that the libewf object would work with libvshadow, but I realized in doing so that I was just creating more problems for myself that were already solved. dfVFS (Digital Forensics Virtual Filesystem) was created to solve all the known issues with all the different image formats and the libraries that need to access them, as a framework and wrapper that allows all of these things to work together. dfVFS is more than just shadow access in E01s; it provides a wrapper around all of the forensic image and virtual disk formats that Metz's libraries support, in a way that means you can write one piece of code to load in any disk type rather than writing 5 functions to deal with each image format and its care and feeding.


I was initially worried about using dfVFS in the blog because of the effort that it appears to take to get it up and going. However, with 13 blog posts already out there showing how to make pytsk work for simple solutions, I think it's time to switch gears and libraries to allow us to accomplish more interesting and complicated tasks together with dfVFS directly.

So with that in mind your homework, dear reader, is to read this post: http://www.hecfblog.com/2015/12/how-to-install-dfvfs-on-windows-without.html and be prepared for tomorrow's first post showing how to work with this amazing collection of libraries.


                                Daily Blog #367: Automating DFIR with dfVFS part 1

                                Hello Reader,
                                         Today we begin again with a new Automating DFIR series.

                                If you want to show your support for my efforts, there is an easy way to do that. 

                                Vote for me for Digital Forensic Investigator of the Year here: https://forensic4cast.com/forensic-4cast-awards/


The last time we started this series (you can read that here: http://www.hecfblog.com/2015/02/automating-dfir-how-to-series-on.html) we were using the pytsk library primarily to access images and live systems. This time around we are going to restart the series from the first steps to show how to do this with the dfVFS library, which makes use of pytsk and many, many other libraries.


In comparison to dfVFS, pytsk is a pretty simple and straightforward library, but it does have its limitations. dfVFS (Digital Forensics Virtual Filesystem) is not just one library; it's a collection of DFIR libraries with all the glue in between so things work together without reinventing the wheel/processor/image format again. This post will start with opening a forensic image and printing the partition table, much like we did in part 1 of the original Automating DFIR with pytsk series. What is different is that this time our code will work with E01s, S01s, AFF and other image formats without us having to write additional code for it. This is because dfVFS has all sorts of helper functions built in to determine the image format and load the right library for you to access the underlying data.

                                Before you get started on this series make sure you have the python 2.7 x86 installed and have followed the steps in the following updated blog post about how to get dfVFS setup:
                                http://www.hecfblog.com/2015/12/how-to-install-dfvfs-on-windows-without.html

                                You'll also want to download our first forensic image we are working with located here:
                                https://mega.nz/#!ShhFSLjY!RawTMjJoR6mJgn4P0sQAdzU5XOedR6ianFRcY_xxvwY


When I got my new machine set up I realized that a couple of new libraries were not included in the original post, so I updated it. If you followed the post to get your environment set up before yesterday you should check the list of modules to make sure you have them all installed. Second, on my system I had an interesting issue where the libcrypto library was being installed as crypto but dfVFS was calling it as Crypto (case matters). I had to rename the directory under \python27\lib\site-packages\crypto to Crypto and then everything worked.

                                If you want to make sure everything works then download the full dfvfs package from github (linked in the installing dfvfs post) and run the tests before proceeding any further.

                                Let's start with what the code looks like:
import sys
import logging

from dfvfs.analyzer import analyzer
from dfvfs.lib import definitions
from dfvfs.path import factory as path_spec_factory
from dfvfs.volume import tsk_volume_system

source_path="stage2.vhd"

path_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_OS, location=source_path)

type_indicators = analyzer.Analyzer.GetStorageMediaImageTypeIndicators(
    path_spec)

source_path_spec = path_spec_factory.Factory.NewPathSpec(
    type_indicators[0], parent=path_spec)

volume_system_path_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/',
    parent=source_path_spec)

volume_system = tsk_volume_system.TSKVolumeSystem()
volume_system.Open(volume_system_path_spec)

volume_identifiers = []
for volume in volume_system.volumes:
  volume_identifier = getattr(volume, 'identifier', None)
  if volume_identifier:
    volume_identifiers.append(volume_identifier)

print(u'The following partitions were found:')
print(u'Identifier\tOffset\t\t\tSize')

for volume_identifier in sorted(volume_identifiers):
  volume = volume_system.GetVolumeByIdentifier(volume_identifier)
  if not volume:
    raise RuntimeError(
        u'Volume missing for identifier: {0:s}.'.format(volume_identifier))

  volume_extent = volume.extents[0]
  print(
      u'{0:s}\t\t{1:d} (0x{1:08x})\t{2:d}'.format(
          volume.identifier, volume_extent.offset, volume_extent.size))

print(u'')

As you can tell this is much larger than our first code example from the pytsk series, which was this:
import sys
import pytsk3
imagefile = "Stage2.vhd"
imagehandle = pytsk3.Img_Info(imagefile)
partitionTable = pytsk3.Volume_Info(imagehandle)
for partition in partitionTable:
  print partition.addr, partition.desc, "%ss(%s)" % (partition.start, partition.start * 512), partition.len


But the easier to read and smaller pytsk example is much more limited in functionality compared to what the dfVFS version can do. On its own our pytsk example could only work with raw images; our dfVFS example can work with multiple image types and already has built in support for multipart images and shadow copies!

                                Let's break down the code:
import sys
import logging

Here we are just importing two standard Python libraries: sys, the default Python system library, and logging, which gives us a mechanism to standardize our logging of errors and informational messages and can be tweaked to give different levels of information based on what level of logging is being requested.
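For example, if you want informational messages printed to the console while you experiment, one optional line (my own addition, not part of the script above) right after the imports is enough:

logging.basicConfig(level=logging.INFO)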

                                Next we are bringing in multiple dfVFS functions:

                                from dfvfs.analyzer import analyzer
We are bringing in 4 helper functions here from dfVFS. First we are bringing in the analyzer function, which can determine for us the type of image, archive or partition we are attempting to access. In its current version it can auto detect the following:

                                • bde - bitlockered volumes

                                • bzip2 - bzip2 archives

                                • cpio - cpio archives

                                • ewf - expert witness format aka e01 images

                                • gzip - gzip archives

                                • lvm - logical volume management, the linux partitioning system

                                • ntfs - ntfs partitions

                                • qcow - qcow images

                                • tar - tar archives

                                • tsk - tsk supported image types

• tsk partition - tsk identified partitions

                                • vhd - vhd virtual drives

                                • vmdk - vmware virtual drives

                                • shadow volume - windows shadow volumes

                                • zip - zip archives



                                from dfvfs.lib import definitions

The next helper function is the definitions function, which maps our named types and identifiers to the values that the underlying libraries are expecting or returning.

                                from dfvfs.path import factory as path_spec_factory

The path_spec_factory helper library is one of the cornerstones of understanding the dfVFS framework. path_spec's are what you pass into most of the dfVFS functions; they contain the type of object you are passing in (from the definitions helper library), the location where this thing is (either on the file system you are running the script from or the location within the image you are pointing to) and the parent path spec if there is one. As we go through this code you'll notice we make multiple path_spec's as we work to build the object we need to pass to the right helper function to access the forensic image.
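To make the chaining idea concrete, here is what a hand-built path_spec chain might look like for an E01 image. This is illustrative only and not part of the script below; 'image.E01' is a made-up file name, and in our script the analyzer figures out the middle layer for us:

os_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_OS, location='image.E01')
ewf_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_EWF, parent=os_spec)
partition_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/',
    parent=ewf_spec)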
from dfvfs.volume import tsk_volume_system

This helper library creates a pytsk volume object for us, allowing us to use pytsk to enumerate and access volumes/partitions.
                                source_path="stage2.vhd"
                                Here we are creating a variable called source_path and storing within it the name of the forensic
                                image we would like to work with. In future examples we will work with other images types and
                                sizes but this is a small and simple image. I've tested this script with vhds and e01s and both opened
                                without issue and without changing any code other than the name of the file.

                                path_spec = path_spec_factory.Factory.NewPathSpec(
                                definitions.TYPE_INDICATOR_OS, location=source_path)

Our first path_spec object. Here we are calling the path_spec_factory helper's NewPathSpec function to return a path spec object. We are passing in the type of file we are working with, which is TYPE_INDICATOR_OS (defined in the dfVFS wiki as a file contained within the operating system), and the location of the file by passing in the source_path variable we made in the line above.

                                type_indicators = analyzer.Analyzer.GetStorageMediaImageTypeIndicators(
                                path_spec)

Next we are letting the Analyzer helper's GetStorageMediaImageTypeIndicators function figure out what kind of image, file system, partition or archive we are dealing with by its signature. It returns the type(s) it found into a variable called type_indicators.
source_path_spec = path_spec_factory.Factory.NewPathSpec(
    type_indicators[0], parent=path_spec)

Once we have the type of thing we are working with we want to generate another path_spec object that has that information within it, so our next helper library knows what it is dealing with. We do that by calling the same NewPathSpec function, but now for the type we pass in the first result that was stored in type_indicators. I am cheating a bit here to make things simple; I should be checking to see how many types are being returned and whether we know what we are dealing with. However, that won't make this program any easier to read and I'm giving you an image that will work correctly with this code. In future blog posts we will put in the logic to detect and report such errors.
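If you want to add that safety check yourself, a minimal sketch (my own addition, not part of the script on GitHub) could look like this:

if not type_indicators:
  raise RuntimeError(u'No supported image type found in {0:s}.'.format(source_path))
if len(type_indicators) > 1:
  raise RuntimeError(u'More than one image type found: {0:s}.'.format(u', '.join(type_indicators)))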
                                volume_system_path_spec = path_spec_factory.Factory.NewPathSpec(
                                definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/',
                                parent=source_path_spec)

Another path_spec object! Now that we have a path_spec object that identifies its type as a forensic image, we can create a path_spec object for the partitions contained within it. You can see we are passing in the type TSK_PARTITION, meaning this is a tsk object that will work with the tsk volume functions. Again, in future posts we will write code to determine if there are in fact partitions here to deal with, but for now just know it works.
volume_system = tsk_volume_system.TSKVolumeSystem()
volume_system.Open(volume_system_path_spec)

Now we are going back to our old buddy libtsk aka pytsk. We are creating a TSK volume object and storing it in volume_system. Then we take the path_spec object we just made, which contains a valid tsk object, pass it into our new tsk volume object and tell it to open it.
volume_identifiers = []
for volume in volume_system.volumes:
  volume_identifier = getattr(volume, 'identifier', None)
  if volume_identifier:
    volume_identifiers.append(volume_identifier)
Here we are initializing a list object called volume_identifiers and then making use of the volumes function within the tsk volume_system object to return a list of volumes aka partitions stored within the tsk volume object we just opened. Our for loop will then iterate through each volume returned, and for each volume it will grab the identifier attribute from the volume object created in the for loop and store the result in the volume_identifier variable.

Our last line of code checks to see if a volume_identifier was returned; if it was, then we append it to the list of volume_identifiers we initialized prior to our for loop.

print(u'The following partitions were found:')
print(u'Identifier\tOffset\t\t\tSize')

for volume_identifier in sorted(volume_identifiers):
  volume = volume_system.GetVolumeByIdentifier(volume_identifier)
  if not volume:
    raise RuntimeError(
        u'Volume missing for identifier: {0:s}.'.format(volume_identifier))

  volume_extent = volume.extents[0]
  print(
      u'{0:s}\t\t{1:d} (0x{1:08x})\t{2:d}'.format(
          volume.identifier, volume_extent.offset, volume_extent.size))

print(u'')
In this last bit of code we are printing out the information we know so far about these partitions/volumes. We do that with a for loop over our volume_identifiers list. For each volume stored within it we call the GetVolumeByIdentifier function and store the returned object in the volume variable. We then print three properties from the volume object returned: the identifier (the partition or volume number), the offset to where the volume begins (in decimal and hex) and lastly how large the volume is.
Woo, that's it! I know that is a lot to go through for an introduction post, but it all builds on this and within a few posts you will really begin to understand the power of dfVFS.
You can download this Python script from the GitHub here: https://github.com/dlcowen/dfirwizard/blob/master/dfvfsWizardv1.py

                                Daily Blog #368: Forensic Lunch 4/8/16 with Jared Atkinson talking about Forensics with Powershell

                                Hello Reader,
                                         What a great Forensic Lunch today with Jared Atkinson talking all about how to do forensics on a live system or mounted image with his Powershell framework PowerForensics.

                                You can grab your own copy of PowerForensics on Github here:
                                https://github.com/Invoke-IR/PowerForensics

                                Read his Blog here:
                                www.invoke-ir.com

                                Vote for him in the Forensic4Cast Awards here:
                                https://forensic4cast.com/forensic-4cast-awards/
                                Reminder I'm up for voting in another category as well!

                                and of course you can follow him on Twitter here:
                                https://twitter.com/jaredcatkinson

Btw, if you want to learn Windows forensics with me, I'm scheduled to teach SANS FOR408 Windows Forensics in Houston May 9-14. You can find out more here:
                                https://www.sans.org/event/houston-2016/course/windows-forensic-analysis


                                You can watch the episode on youtube here:
                                https://www.youtube.com/watch?v=uCffFc4r4-k

                                It's also on iTunes or you can just watch it below:

                                Daily Blog #369: Saturday Reading 4/9/16

                                Hello Reader,

It's Saturday! I'm excited to post my first Saturday Reading in almost two years! While I get to work on seeing what's changed in the world of RSS feeds and Twitter tags since I last did this, here is this week's Saturday Reading!

1. We had a great Forensic Lunch this week, with Jared Atkinson talking all about how to do forensics on a live system or mounted image with his PowerShell framework PowerForensics.
                                You can watch the episode on youtube here: https://www.youtube.com/watch?v=uCffFc4r4-k

2. Adam over at Hexacorn is continuing to update his tool DeXRAY, which can examine, extract and detail information about malware quarantined by 20 different antivirus products. If you've ever been frustrated that the very thing you need to analyze is being withheld by an antivirus product's quarantine, this should help.


3. On the CYB3RCRIM3 blog there is a neat post covering the basic facts and a judge's ultimate opinion regarding a civil case that involved the Computer Fraud and Abuse Act (CFAA). While there are a lot of criminal cases out there that have CFAA charges, there are few civil CFAA cases that I know of outside of the ones I've been involved in.


                                4. Harlan has a new post up on his blog Windows Incident Response. It covers some new WMI persistence techniques he's seen used by attackers in the wild. Not only does Harlan link to a blog he wrote for SecureWorks on the topic but he also linked to a presentation written by Matt Graeber from Mandiant.


5. Also on Harlan's blog he's let us know that the 2nd edition of Windows Registry Forensics is out!

Read more about it here and get a copy for yourself! http://windowsir.blogspot.com/2016/04/windows-registry-forensics-2e.html

                                6. The 2016 Volatility Plugin Contest is live! If you have an idea or just want to go through the learning process of how to write a Volatility plugin for cash and prizes you should go here: http://volatility-labs.blogspot.com/2016/04/the-2016-volatility-plugin-contest-is.html

                                Did I miss something? Let me know in the comments below!

                                Daily Blog #370: Sunday Funday 4/10/16

                                Hello Reader,
If you watched the Forensic Lunch Friday you would have heard us talking to Jared Atkinson about PowerForensics, his DFIR framework written entirely in PowerShell. Let's see what your determination of its forensic soundness is in this week's Sunday Funday challenge.

                                The Prize:
                                $200 Amazon Giftcard

                                The Rules:

                                1. You must post your answer before Monday 4/11/16 3PM CST (GMT -5)
                                2. The most complete answer wins
                                3. You are allowed to edit your answer after posting
                                4. If two answers are too similar for one to win, the one with the earlier posting time wins
                                5. Be specific and be thoughtful 
6. Anonymous entries are allowed; please email them to dcowen@g-cpartners.com. Please state in your email whether you would like to remain anonymous if you win.
7. In order for an anonymous winner to receive a prize they must give their name to me, but I will not release it in a blog post



                                The Challenge:
The term "forensically sound" has a lot of vagueness to it. Let's get rid of the ambiguity regarding what changes when you run the PowerForensics PowerShell script to extract the $MFT from a system. Explain what changes and what doesn't, from executing the PowerShell script to extracting the file.

                                Daily Blog #371: Sunday Funday 4/10/16 Winner!

                                Hello Reader,
Another challenge has been answered by you, the readership. This week our anonymous winner claims a $200 Amazon gift card for showing what the impact of installing and running PowerForensics is. You too can join the ranks of Sunday Funday winners, and I think I'm going to do something special for all past and future winners so everyone can know of your deeds.




                                The Challenge:

The term "forensically sound" has a lot of vagueness to it. Let's get rid of the ambiguity regarding what changes when you run the PowerForensics PowerShell script to extract the $MFT from a system. Explain what changes and what doesn't, from executing the PowerShell script to extracting the file.


                                The Winning Answer:
                                Anonymous Submission

                                This answer is based on the assumption that you are not connecting to the target system via F-Response or a similar method and that you are running the PowerForensics PowerShell script directly on the target system.  This also assumes that the PowerForensics module is already installed on the system.

When the PowerShell script is executed, program execution artifacts associated with PowerShell will be created. These artifacts include the creation of a prefetch file (if application prefetching is enabled), a record in the application compatibility cache (the exact location/structure of which depends on the version of Windows installed), a record in the MUICache, and possibly a UserAssist entry (if the script was double-clicked in Explorer). In addition, event log records may be created in the Security event log if process tracking is enabled.
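A minimal sketch, assuming the default C:\Windows\Prefetch location and that application prefetching is enabled, of how those prefetch artifacts could be spotted afterward:

# Minimal sketch (assumes the default prefetch directory and that
# application prefetching is enabled): list the prefetch files left
# behind by powershell.exe executions.
import glob

for prefetch_file in glob.glob(r"C:\Windows\Prefetch\POWERSHELL.EXE-*.pf"):
    print prefetch_file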

Installing the PowerForensics PowerShell module will result in different artifacts depending on the version of PowerShell installed on the target system. If Windows Management Framework version 5 is not installed on the target system, the PowerForensics module can be installed by copying the module files to a directory in the PSModulePath. Using this method will result in the creation of new files in a directory on the target system, which brings with it the file creation artifacts found in NTFS (e.g. $MFT record creation, $UsnJrnl record creation, parent directory $I30 updates, changes to the $Bitmap file, etc.). If Windows Management Framework version 5 is installed, the Install-Module cmdlet can be used to install. This may require the installation of additional cmdlets in order to download/install the PowerForensics module, which would result in additional files and directories being created in a directory in the PSModulePath.

                                Since the script uses raw disk reads to determine the location of the $MFT on disk, it should not impact the $STANDARD_INFORMATION or $FILE_NAME timestamps of the files being copied.
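To make the raw-read point concrete, here is a minimal pytsk3 sketch of the same idea (an illustration only, not PowerForensics itself); it assumes an elevated prompt on a live NTFS C: volume, and because $MFT is reached through the raw device rather than the Windows file API, its timestamps are not updated:

# Illustrative sketch only, not PowerForensics: copy $MFT off a live NTFS
# volume using raw reads with pytsk3. Requires administrator rights.
import pytsk3

img = pytsk3.Img_Info("\\\\.\\C:")   # raw handle to the live C: volume
fs = pytsk3.FS_Info(img)             # parse NTFS straight from the raw device
mft = fs.open("/$MFT")               # resolved via the MFT itself, not CreateFile

# Note: writing the copy to the same volume would itself create the NTFS
# file creation artifacts described above, so send it somewhere else.
with open("mft_copy.bin", "wb") as out:
    offset = 0
    size = mft.info.meta.size
    while offset < size:
        chunk = mft.read_random(offset, min(1024 * 1024, size - offset))
        out.write(chunk)
        offset += len(chunk)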

                                Daily Blog #372: Automating DFIR with dfVFS part 2

                                Hello Reader,
In this short post I want to get more into the idea of the path specification object we made in the prior part. If this post had a catchy title it would be zen and the art of path specification.

In the prior post, part 1 of the series, we made three path specification objects. I described path specification objects as the cornerstone of understanding dfVFS, which I believe to be true. What I didn't point out is that the path specification objects in that first example code were building on top of one another like a layer cake.

                                Let's take a look at the three objects we created again.

# source_path and type_indicators were set earlier in the part 1 script.
# Layer 1: the image file as it sits on our operating system's file system.
path_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_OS, location=source_path)

# Layer 2: the storage media type (E01, raw, etc.) returned by our indicator query.
source_path_spec = path_spec_factory.Factory.NewPathSpec(
    type_indicators[0], parent=path_spec)

# Layer 3: the partition table inside that image, handled by the TSK back end.
volume_system_path_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/',
    parent=source_path_spec)

                                If you were to look carefully you would notice there are a couple of differences between
                                the calls to the NewPathSpec function.

1. The type of path specification we are making is changing. We start with an operating system
file, then an image (which is being set by the return of our indicator query), and lastly
we are working with a partition.
2. Two of our path specifications declare a location; one does not.
3. Most importantly, source_path_spec and volume_system_path_spec have parents. Those parents
are the path specification objects created prior.

So if you were to look at it as one single object with multiple layers, it would look
something like this.

-----------------------------
|  OS File Path Spec        |
-----------------------------
|  TSK Image Type Path Spec |
-----------------------------
|  TSK Partition Path Spec  |
-----------------------------

The lowest layer in the object can reference the upper layers. This is why we don't just
create one path specification object. Instead we are initializing each layer of the object
one call at a time, as we determine the type of image, directory, archive, etc. we are
working with, to allow our path specification object to reflect the data we are trying
to get dfVFS to work with.
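A small sketch makes that chaining visible (assuming the three path spec objects above are in scope): start at the innermost layer and follow .parent upward, printing each layer's type indicator.

# Sketch: walk the layer cake from the innermost path spec up through its
# parents; the OS layer at the top has no parent, so the loop stops there.
layer = volume_system_path_spec
while layer is not None:
    print layer.type_indicator
    layer = layer.parent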

The part of the dfVFS framework you are working with determines how many of these layers
need to exist prior to calling that function with your fully developed path specification
object.

As we go farther into the series I will show you how to interact with the files stored in the
partitions we listed in part 1. Doing that will create yet another layer in our object,
the file system layer. This is very similar to how we built our objects in pyTSK.
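As a hedged preview of that step (a sketch, not the code the later posts will use), the file system layer is just one more NewPathSpec call; here I assume we point at a specific partition such as /p1 rather than the / listing used above.

# Preview sketch: assume the first partition ('/p1') holds the file system
# we want, then stack a TSK file system layer on top of that partition layer.
partition_path_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_TSK_PARTITION, location=u'/p1',
    parent=source_path_spec)

file_system_path_spec = path_spec_factory.Factory.NewPathSpec(
    definitions.TYPE_INDICATOR_TSK, location=u'/',
    parent=partition_path_spec)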

                                If you want to read how Metz explains Path Specification objects you can read about them 
                                here: https://github.com/log2timeline/dfvfs/wiki/Internals

                                Tomorrow I will explain how we access raw images and then Thursday we will extract a file
                                from an image. 