Kumodd

A tool for cloud storage forensics

Kumodd is a forensic tool that I created for my master’s thesis at The University of New Orleans. It acquires evidence from cloud storage services through their official APIs rather than from traces left on the user’s device. It supports four major providers: Google Drive, Dropbox, OneDrive, and Box. It has both command-line and web user interfaces, and can download files, folders, revisions, and cloud-only documents. It also produces a CSV file with the metadata of all the downloaded files, and a local directory that replicates the cloud storage structure.
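To make the two outputs concrete, here is a minimal sketch of writing downloaded files into a local directory that mirrors their cloud paths while logging one CSV row per file. It is not kumodd's actual code: the function name save_acquired_files, the file name metadata.csv, and the column names are hypothetical, and the cloud paths are assumed to be absolute POSIX-style paths.

```python
import csv
import tempfile
from pathlib import Path, PurePosixPath


def save_acquired_files(files, root):
    """Write each (cloud_path, content) pair under `root` and log it to a CSV.

    Hypothetical helper: names and CSV columns are assumptions for illustration.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    csv_path = root / "metadata.csv"
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["cloud_path", "local_path", "size_bytes"])
        for cloud_path, content in files:
            # Drop the leading "/" so the cloud path nests under the local root.
            relative = PurePosixPath(cloud_path).relative_to("/")
            if ".." in relative.parts:
                raise ValueError(f"refusing path outside the root: {cloud_path!r}")
            local_path = root.joinpath(*relative.parts)
            local_path.parent.mkdir(parents=True, exist_ok=True)
            local_path.write_bytes(content)
            writer.writerow([cloud_path, str(local_path), len(content)])
    return csv_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = [("/reports/q1.txt", b"hello"), ("/notes.txt", b"hi")]
        print(save_acquired_files(sample, tmp).read_text(encoding="utf-8"))
```

A real acquisition would record far more metadata per file (timestamps, hashes, revision identifiers), but the shape is the same: one mirrored file on disk and one row in the CSV for each item.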

Why did I create kumodd?

Cloud storage services are popular among consumers and businesses for storing, sharing, and syncing files across devices. However, they also pose new challenges for digital forensic investigations, as they may contain critical evidence related to crimes or cyberattacks.

Existing forensic methods depend on analyzing the traces left by the client applications on the user’s device, which may be incomplete, outdated, or missing. Moreover, these methods require significant reverse engineering efforts for each service and version, which is tedious and prone to errors.

I wanted to create a tool that can overcome these limitations by using the APIs of the services, which are well-documented, stable, and standardized interfaces that allow the client applications to communicate with the cloud servers. By using the APIs, I can access the most recent and complete data stored in the cloud, as well as additional metadata and features that may not be available on the client side.

How did I create kumodd?

Architecture

The overall architecture of kumodd consists of three separate layers: user interface, dispatcher, and acquisition drivers. The user interface is responsible for interaction with the forensic analyst and provides both a web-based GUI (suitable for interactive exploration) and a command-line interface (suitable for automation). The dispatcher collects acquisition parameters and schedules the acquisition tasks against the forensic targets. The tasks are executed by service-specific driver modules that work via the public API supplied by the cloud provider and return the data in different, user-specified formats, such as JSON and CSV.
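The layering described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not kumodd's source: the class names AcquisitionDriver, FakeDriveDriver, and Dispatcher, the service name "fakedrive", and the metadata fields are all hypothetical, and the driver returns canned records instead of calling a real cloud API.

```python
import csv
import io
import json
from abc import ABC, abstractmethod

# Hypothetical metadata columns used by this sketch.
FIELDS = ["path", "size", "owner"]


class AcquisitionDriver(ABC):
    """Service-specific layer: one subclass per cloud provider."""

    @abstractmethod
    def list_files(self, account):
        """Return a list of metadata dicts for the files in `account`."""


class FakeDriveDriver(AcquisitionDriver):
    """Stand-in driver that returns canned metadata instead of calling an API."""

    def list_files(self, account):
        return [
            {"path": "/reports/q1.docx", "size": 1024, "owner": account},
            {"path": "/photos/cat.jpg", "size": 2048, "owner": account},
        ]


class Dispatcher:
    """Middle layer: maps a service name to its driver and runs the task."""

    def __init__(self):
        self._drivers = {}

    def register(self, service, driver):
        self._drivers[service] = driver

    def acquire(self, service, account, output_format="json"):
        if service not in self._drivers:
            raise ValueError(f"no driver registered for {service!r}")
        records = self._drivers[service].list_files(account)
        if output_format == "json":
            return json.dumps(records, indent=2)
        if output_format == "csv":
            buffer = io.StringIO()
            writer = csv.DictWriter(buffer, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(records)
            return buffer.getvalue()
        raise ValueError(f"unsupported format {output_format!r}")


if __name__ == "__main__":
    # User-interface layer reduced to one call, as a CLI or web view might make.
    dispatcher = Dispatcher()
    dispatcher.register("fakedrive", FakeDriveDriver())
    print(dispatcher.acquire("fakedrive", "analyst@example.com", output_format="csv"))
```

The point of the split is that supporting another provider means writing one more driver class and registering it; the dispatcher and both user interfaces stay unchanged.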

kumodd architectural diagram

Tools

The whole application is written in Python. It uses several open-source libraries to interact with the cloud services' APIs and to build the web-based GUI. Here's a list of the main libraries that allowed me to make kumodd happen:

  • python-gflags: This package is the Python equivalent of gflags, the C++ command-line flag implementation developed by Google. It provides a library for processing command-line flags, includes built-in support for Python types, and allows flags to be defined in the source file in which they are used.
  • google-api-python-client: This package contains the Python client library provided by Google for discovery-based APIs.
  • dropbox: This package contains Python-specific libraries that wrap the raw HTTP calls to the Dropbox API.
  • boxsdk: This package contains Python-specific libraries that wrap the raw HTTP calls to the Box API.
  • Flask: Flask is a web application framework written in Python. It is based on the Jinja2 template engine and the Werkzeug toolkit. This framework is used to give back-end applications a web-based interface.
  • Flask-Classy: This is an extension of Flask that adds class-based views.

The results

The finished tool has several benefits over existing methods and tools for cloud storage forensics:

  • It can access the most recent and complete data stored in the cloud, as well as additional metadata and features that may not be available on the client side.
  • It does not require upfront substantial investment in reverse engineering each service and version, which is time-consuming and error-prone.
  • It does not rely on the web browser or the client applications to access the cloud storage accounts, which could alter or delete the evidence.
  • It is easy to use and reliable. It uses well-documented, stable, and standardized interfaces that communicate with the cloud servers.
  • It is adaptable and extensible. It can be updated to work with new features or services with modest effort.

kumodd interface

A publish-worthy work

The research behind kumodd led to a paper that describes our experiences in developing cloud forensics tools for the SaaS model. We present three case studies: kumodd, kumodocs, and kumofs, which demonstrate the challenges and opportunities of cloud forensics. To read more about our take on these topics, see the Digital Investigation journal (Volume 18, September 2016, Pages 79-95), published by Elsevier.