Forensics on Twitter and WeChat Using a Customised Android Emulator

Songyang Wu, The Third Research Institute of Ministry of Public Security, Shanghai, China, e-mail: wusongyang@stars.org.cn
Xin Liu, The Third Research Institute of Ministry of Public Security, Shanghai, China
Wenqi Sun, The Third Research Institute of Ministry of Public Security, Shanghai, China
Yong Zhang, The Third Research Institute of Ministry of Public Security, Shanghai, China

Abstract: Public reports show that crimes linked to social networks have increased sharply in recent years. The social networking activity records extracted from a suspect’s mobile device play an important role in an investigation. However, previous works mainly conducted forensics by extracting electronic evidence that is cached in mobile devices and may be deleted for various reasons. In this paper, we propose a novel approach to investigating Twitter and WeChat based on a customised Android emulator, which can collect evidence from both the local device and remote online services. Our approach was evaluated using data copies of Twitter and WeChat from different Android smartphones, which validated the practicability of the proposed method. This paper is intended to provide a useful reference for investigators and researchers working on digital forensics.

Keywords-digital forensics; Android emulator; electronic evidence; investigation

I. INTRODUCTION

With the rapid development and popularisation of mobile networks and smartphone technologies, social networking on mobile devices has gradually become a new platform for information exchange and social interactions. Mobile social networking applications cater to various needs of people, such as entertainment, interests, work, and communication. According to WeChat’s data report (http://blog.wechat.com/2016/12/29/the-2016-wechat-data-report/), WeChat had 768 million daily logged-in users on average in December 2016, an increase of 35% year-over-year.
However, mobile social networking applications are also being used by criminals to conduct a variety of criminal activities, such as phishing, scamming, selling illegal goods, and disseminating pornographic materials. Furthermore, these applications may become communication tools for criminal activities, and criminal groups can use their rich social functionality to organise and coordinate such activities. Public reports (http://en.people.cn/n3/2017/0103/c90000-9162351.html) showed that the number of cases in which social networks were used to conduct criminal activities has recently increased sharply. The social networking activity records extracted from a suspect’s mobile device could play an important role in investigation and legal action, so research on forensic technology for social networking applications is urgently needed.

Several studies [1]-[3] have been conducted on mobile social network forensics. However, previous works mainly conducted analyses by extracting electronic evidence stored in mobile terminals (known as ‘local data’). Investigating only local data may result in important electronic evidence being missed. For example, during usage of the phone, users or various “junk clean” tools, such as “com.symantec.cleansweep” (https://play.google.com/store/apps/details?id=com.symantec.cleansweep), could remove cached images, videos or audio files that were downloaded from social networks in order to conserve storage space. In addition, after a suspect’s phone is sealed by investigators, conversation messages received subsequently cannot be acquired from the device. In contrast, data stored on network servers (known as ‘online data’) is more complete. However, during a digital investigation, we should try to avoid using the suspect’s mobile phone to directly access the remote online service.
Although law enforcement agencies can request the social networking service provider to collect online data evidence, the overall process is complicated and time-consuming. Collecting online evidence is even more difficult when the electronic evidence is located in another jurisdiction.

Commercial digital forensic products, such as Cellebrite UFED Cloud Analyzer [4] and XRY Cloud [5], provide the ability to gather data beyond the device from remote social media and cloud-based sources, including Facebook, Twitter, Gmail, Google Drive and more. They can use login credentials extracted from devices to gain access to cloud sources. Compared with these commercial products, our method is simpler. In the absence of specialized products such as UFED Cloud Analyzer, investigators could consider using our ideas to extract the remote online data of certain applications.

2018 IEEE 4th International Conference on Computer and Communications, 978-1-5386-8339-2/18/$31.00 ©2018 IEEE

Focusing on the aforementioned investigation requirement, this paper proposes a novel approach, where we implement two forensics schemes (for Twitter and WeChat respectively) to collect evidence from remote services
based on a customised Android emulator. An additional use case of this proposal is that the customised emulator could facilitate the presentation of evidence in the courtroom: the system could allow juries to view data in the emulated environment as if it were on the device at the time in question.

The main contributions of this paper are as follows:
• We explore the common questions that arise when investigating Twitter and WeChat on Android devices, including: 1) the application’s data storage scheme in the Android system; 2) how to extract the user activity records.
• We give a detailed implementation of investigating the online data of Twitter and WeChat using a customised Android emulator, while important data on the suspect’s mobile device is preserved.
• We discuss how to address the legal and forensic concern that our approach may alter remote data evidence, either accidentally or purposefully.
• Various versions of Twitter and WeChat, as well as several Android mobile devices, were used to evaluate the proposed solution. The experiments confirmed the applicability of our approach.

II. SYSTEM OVERVIEW

Twitter is a microblog and social networking application. Through Twitter, users can follow each other and send/receive tweets containing text, pictures, videos, and other multimedia content. WeChat allows users to interact with friends, facilitates the sharing of pictures and videos, and provides text and voice instant messaging services.

While using Twitter and WeChat, the Android smartphone caches a large amount of data, including the user’s account (the suspect’s account), friends list, conversation histories, sharing records, and sent or received messages. For the reasons discussed in Section I, the local data collected from the device may be incomplete, while the online services of Twitter and WeChat often retain the user’s activity records for a period of time.
If the suspect’s account can be used to log in to the online services, online data can be collected to assist the digital investigation. Figure 1 shows the framework of the proposed forensics scheme. It comprises two essential modules: device data acquisition and online data acquisition (using an Android emulator). The device data acquisition module extracts Twitter and WeChat account information as well as all cached data from the mobile device. In the online data acquisition module, we built a customised Android emulator into which social networking application data extracted from the mobile device can be imported; the application is then executed on the customised Android emulator, and the imported account data is used to log in to the service. An investigator can subsequently use the emulator to browse online data and cache all needed remote data in the emulator itself (the emulator can be viewed as an Android device with root privileges). Next, through an adb (Android Debug Bridge) connection, the online data cached in the emulator can be acquired by the device data acquisition module. In this way, the collected data can more completely illustrate the suspect’s activity on the social network.

III. LEGAL AND FORENSIC CONCERNS

In our approach, there is a potential forensic concern that investigators have “write” access to remote sessions, either accidentally or purposefully, which in some jurisdictions and scenarios may be controversial or not allowed. In fact, this case is similar to that of “live-remote digital evidence collection”. Although the legal risk exists and case law is rare in this area, Kenneally et al. [6], [7] argued that some additional legal risks are afforded by remote live forensics and that authenticity challenges could be successfully defended by thorough tool testing. Forensics schemes [8], [9] adopting live-remote forensics have also been proposed in the past several years.
In China, the Supreme People’s Court, the Supreme People’s Procuratorate and the Ministry of Public Security jointly issued the “Provisions on Several Issues concerning the Collection, Extraction, Review and Determination of Electronic Data in the Handling of Criminal Cases” in 2016 [10]. Article 9 of this regulation permits live-remote digital evidence collection through networks if the original storage media cannot be seized or sealed. Article 5 gives various ways of protecting data integrity, including the method “video recording of relevant activities for collecting and extracting electronic data”. Despite the lack of internationally accepted specifications or guidelines on “live-remote digital evidence collection”, we believe that under strict regulatory measures such as “video recording”, the legal and forensic concerns of our approach can be alleviated. However, it should be noted that this work merely discusses a method of extracting remote online data from a technical point of view. Readers need to consider the forensic principles to be adhered to when applying the proposed method.

IV. OUR INVESTIGATION SCHEME

In this section, we first briefly explain forensics on local data in an Android device, and then discuss the proposed online data forensics scheme in detail. Version 6.5.0 of Twitter and version 6.2.5 of WeChat were used as samples during our study. These two applications were installed on a XiaoMi Mi-4C smartphone. In the testing stage, we used several versions of the two apps and different models of Android smartphones to evaluate the applicability of the proposed technical method.

A. Local Data Forensics on Android Device

(1) Storage of user data on Android device: Twitter forensics focuses on information about the user account, followers lists, tweets, and direct messages.
After Twitter is installed, the directory /data/data/com.twitter.android/ (hereafter referred to as <TwitterHome>) is created to store user data. The primary directories of evidence sources are as follows:
• <TwitterHome>/shared_prefs/. This folder stores application preference files. The file com.twitter.android_preferences.xml records the Twitter user ID
of the current account. The “TwitterID” stored in this file is useful for locating the databases of the corresponding account.
• <TwitterHome>/databases/. This directory stores database files used by Twitter. Of these, database files with the name format [TwitterID]-[No].db store cached data generated when the user uses Twitter, including viewed tweets, search histories, and followed and following users. Database files called [TwitterID]-drafts.db store user drafts, such as tweets that were not successfully sent.
• /sdcard/Android/com.twitter.android/cache/. This is the cache directory for multimedia resources, which stores received and downloaded data of various multimedia types. The main subdirectories are photos and users. The photos directory stores pictures contained in tweets and friend chat histories, and the users directory stores pictures from the Twitter account and friends.

Figure 1. The proposed forensics framework.

After WeChat is installed on an Android device, the directories /data/data/com.tencent.mm (hereafter referred to as <WeChatHome>) and /sdcard/Tencent/MicroMsg/ are created. On logging in, WeChat creates a unique numeric identifier “uin” for each user and places a corresponding personal data folder under the path “/data/data/com.tencent.mm/MicroMsg”. The personal data folder is named using the MD5 value calculated from the user’s uin. For example, the folder name could be computed as “833dd516f909cba7ef7d16bd9a4673e8 = MD5(‘mm’ + uin)”. In the rest of this paper, the symbol <udir> is used to denote the personal folder name of a user. The path “/sdcard/Tencent/MicroMsg” is used to store multimedia resources such as received images and audio files.
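The folder naming rule MD5(‘mm’ + uin) described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ tool: the function name wechat_user_dir and the sample uin are hypothetical, and it assumes the uin is concatenated as a decimal string and hashed as UTF-8 bytes.

```python
import hashlib


def wechat_user_dir(uin: str) -> str:
    """Return the personal folder name <udir> = MD5('mm' + uin) as 32 hex characters.

    Assumption: the uin is used in its decimal string form and the
    concatenated string is hashed as UTF-8 bytes.
    """
    return hashlib.md5(("mm" + uin).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical uin, used only to show the call; it is not taken from the paper.
    sample_uin = "1234567890"
    udir = wechat_user_dir(sample_uin)
    print("/data/data/com.tencent.mm/MicroMsg/" + udir)
```

Given a uin recovered from the device, an investigator could compare the computed value against the folder names found under /data/data/com.tencent.mm/MicroMsg to identify the folder belonging to that account.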
Each user also has a private folder under the path “/sdcard/Tencent/MicroMsg”, likewise named by computing MD5(‘mm’ + uin).

(2) Data acquisition: Data acquisition in Android forensics has been discussed in published books [11], [12] and the literature [6], [13]; interested readers can consult those documents for details of data collection. For our approach, the key step is to obtain root permissions on the Android device. The root tools used in our experiments are “360 Root” (http://root.360.cn/) and “Root Genius” (http://www.shuame.com/en/root/).

(3) Digital investigation of Twitter on Android devices: Twitter utilises a user ID, a string of numbers, to name each user’s database. The database name usually has the format ***-[Twitter ID].db. We explored the <TwitterHome>/databases folder and identified the user’s database by his/her ID. The user’s connections, communication records, search history, and activity log are stored in the user database. Table I explains the main tables of interest in this database.

TABLE I. DATA INVESTIGATION OF TWITTER ON ANDROID
Category | Data table | Explanation
Friends | users | This table includes each friend’s name, login name, ID and description.
Instant messages | messages | Each record of this table contains the sender’s ID, the recipient’s ID, the content of the message and a timestamp. Pictures, videos or other multimedia data in the content are stored as a resource URL.
Search history | Search_queries | Queried keywords and timestamps are stored.
Activity logs | Timeline_view | Activity logs are tweets that the user posted or browsed. A record of this view includes the tweet, the author and the source of the tweet.

(4) Digital investigation of WeChat on Android devices: Table II lists the main evidence sources for WeChat; the relative paths in it are located under the WeChat directories.

B. Online Data Forensics Based on a Customized Android Emulator

(1) Overview: To collect remote online data safely, we propose to log in to the social networking service using the suspect’s account on an Android emulator.
Custom modifications were made to the Android virtual machine based on the suspect’s mobile device, so that the Twitter or WeChat application installed on the emulator “believes” that it still resides on the original device. Because mobile devices such as smartphones often alternate between online and offline states, the login state of a social networking user usually persists for a long time. During this period, we import the suspect’s login authorisation data into the Android emulator and connect the
emulator to the Internet. Applications such as Twitter and WeChat will recognise the emulator as the suspect’s last logged-in Android device and log in to the online service without requiring the corresponding username and password. The basic process of online data forensics using our customised Android emulator is shown in Figure 2.

Figure 2. Collect online data evidence.

(2) Configuring the customized emulator: This study used the x86-architecture Android emulator Genymotion [16] and version 4.4.4 of the Android OS to construct the virtual runtime environment. The Android emulator had to be customised in order to log in successfully to the social network services using imported user account data. The major customisations are as follows:
• The IMEI (International Mobile Equipment Identity) of the emulator must be exactly the same as the IMEI extracted from the suspect’s Android smartphone. The Android ID and IMEI of the emulator can be directly modified through the configuration tool provided by the Genymotion emulator. Google’s Android emulator writes the IMEI and other information into the binary files emulator-arm.exe or emulator-x86.exe. To specify an IMEI, these files have to be modified using binary editing tools. However, changing the executable file can cause unforeseen results during subsequent emulator execution. Therefore, using the Genymotion emulator is the better choice.
• Editing the configuration file /system/build.prop. Configurations of the Android runtime environment are stored in /system/build.prop, such as CPU_ABI (type of Central Processing Unit + convention of Application Binary Interface), device manufacturer, device serial number, and system version. Apps such as Twitter and WeChat collect runtime environment information during start-up, which often includes ro.product.model, ro.product.brand, ro.product.name, ro.product.device, ro.product.board, ro.build.product, ro.product.cpu.abi, and ro.product.manufacturer.
The /system/build.prop file can be obtained from the Android device, and we usually overwrite the contents of the emulator’s build.prop with those of the device’s file. Note that accessing the file /system/build.prop requires root privilege.
• The Android x86 ARM translation library needs to be installed, which allows the native libraries of the application (e.g., libraries with the .so extension) to work on the x86-architecture emulator.

TABLE II. DATA INVESTIGATION OF WECHAT ON ANDROID
Category | File/Database/Folder | Explanation
Account | [WeChatHome]/MicroMsg/md5/EnMicroMsg.db | md5 is the hash value computed from uin. EnMicroMsg.db is an encrypted SQLite database; how to decrypt it is described in the literature [14], [15]. Detailed account information is stored in the data table “userinfo”.
Friends | [WeChatHome]/MicroMsg/md5/EnMicroMsg.db | The userinfo data table in this database records friend and group relationships.
Messages | [WeChatHome]/MicroMsg/md5/EnMicroMsg.db | The message and userinfo tables can be used to restore conversation sessions.
Received documents | /sdcard/tencent/MicroMsg/md5/ | Images, voices and videos embedded in communications are cached in this path. The literature [14], [15] outlines how to find them.
Moments | [WeChatHome]/MicroMsg/md5/SnsMicroMsg.db | Messages of Moments are stored in the SnsMicroMsg.db database without encryption.

(3) Migrate application data to emulator: After the emulator runtime environment has been prepared, the next step is to migrate the cached data of Twitter and WeChat from the Android device to the Android emulator. The cached data cannot usually be exported through back-up, so root privilege on the Android device is required. In order to ensure that the permission settings of data files are not changed during the migration, the following procedure is recommended for data acquisition on the Android device:
• To avoid writing data to a user data partition, insert a blank MicroSD card into the device.
• Enter the /data/data path through an established adb shell connection. Compress and package the
/data/data/com.tencent.mm folder using the command tar zcvf /sdcard/tencent.tar.gz /data/data/com.tencent.mm. For Twitter, package the /data/data/com.twitter.android directory using the same command. The command specifies a path on the SD card to ensure that the extracted data is written to the newly inserted MicroSD card.
• Extract the tencent.tar.gz or twitter.tar.gz file from the device with the adb pull command.

After the application data has been acquired, the following procedure should be used to construct the runtime environment on the emulator:
• Use the adb command pm path to obtain the WeChat and Twitter apk installation paths from the Android mobile device. Next, using the adb pull command, extract the stored apk files from the device. This method ensures that the version of the application installed in the emulator is the same as that on the Android device.
• Connect to the Android emulator through adb, and use the pm install command to install the apk files obtained from the device on the emulator.
• Enter the /data/data path of the mobile device, and use the ls -l command to inspect the privileges and user group information of the com.tencent.mm and com.twitter.android directories. For instance, we assume that the user and user group of the WeChat directory are both u0_a123 and that the user and user group of the Twitter directory are both u0_a135.
• Copy the extracted data of Twitter and WeChat to the /data/data folder in the Android emulator, keeping the path structure of this directory consistent with that of the Android device. If the data files were packed using the tar command, transfer tencent.tar.gz or twitter.tar.gz to the /data/data path of the emulator through the adb connection and decompress the package file using the tar zxvf filename.tar.gz command.
• In the Android emulator, the user and user group assignments of the above folders are inconsistent with those of the Android device.
Therefore, we need to modify the owner assignments of the imported folders com.tencent.mm and com.twitter.android in the emulator so that they remain consistent with the corresponding folders on the Android device. For instance (using the values from the third item):
chown -R u0_a123:u0_a123 /data/data/com.tencent.mm
chown -R u0_a135:u0_a135 /data/data/com.twitter.android
• For WeChat, we also port the folder /sdcard/tencent into the emulator in the same manner.
• After configuration is complete, double-click Twitter or WeChat on the emulator desktop to start it up and log in to the remote service without entering a username and password.

(4) Forensics after simulation: After the application begins running in the emulator, investigators can operate the Twitter and WeChat applications to access as much remote online data as possible, for example by inspecting instant messages and browsing tweets. The browsed online data will be cached in the Android emulator; afterwards we can conduct evidence collection and forensics on the emulator in the same way as on the Android mobile device.

V. TESTING

This section begins with a case study demonstrating the effects of forensics through the Android simulator. To test the applicability of our proposed digital investigation scheme, we then evaluated it on four common Android device models.

A. Case Study

In the first case, we illustrated the proposed scheme using Twitter, a Mi-4C Android smartphone and the Genymotion simulator. The Android version of the Mi-4C is 7.1.2, on which we installed Twitter “com.twitter.android_v6.50.0-7110077_Android-4.2.apk”. We downloaded “Genymotion for personal use” and created a simulator with Android v7.1.1. The Android x86 ARM translation library “Genymotion-ARM-Translation v1.1” and Twitter for x86 “com.twitter.android 6.50.0-7160077 minAPI17(x86)(nodpi) apkmirror.com.apk” were installed on the simulator.
We constructed group and one-to-one Twitter conversation sessions on the Mi-4C phone, and then uninstalled Twitter (which also removes the user data) and reinstalled it. After logging in to the Twitter account, we immediately turned off the phone’s network connections. Thus, we obtained a local data case in which only part of the data was cached compared with the remote online service. The figure of the Twitter conversation session layout on the smartphone shows clearly that some pictures and videos had not been synced from the remote online service. Next, we pulled the entire Twitter data from the smartphone and imported it into the Android emulator, and then set up the simulation environment following Section IV-B (2) (Configuring the customized emulator) using the metadata of the Mi-4C device. After finishing the simulator setup, we ran Twitter on the Android emulator, and Twitter logged in to the online service using the imported account data without requiring the password. As shown in Figure 3(a), the missing pictures and videos were synced to the emulator after the conversation session layouts were browsed on the customised Android simulator.

In the second case, we tested the proposed scheme using WeChat v6.2.5, an OPPO R7 smartphone and the Genymotion simulator. The setup steps were similar to those of the first case. Figure 3(b) shows WeChat running on the emulator, where the missing multimedia files were synced to the customised Android simulator after the conversation session layouts were browsed.
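The data migration steps of Section IV-B (3) that both case studies rely on can be summarised as an ordered list of adb invocations. The sketch below only builds the command lines and does not execute them. It is an illustration under stated assumptions rather than the authors’ tooling: the function name plan_migration and the serial numbers are hypothetical, obtaining root on the device through su -c is an assumption, and the archive is packed with a path relative to /data/data so that unpacking in the emulator’s /data/data restores the same layout.

```python
from typing import List


def plan_migration(package: str, owner: str, archive: str,
                   device: str, emulator: str) -> List[List[str]]:
    """Return argv lists (usable with subprocess.run) for migrating one app's data.

    device and emulator are adb serial numbers; owner is the user/group name
    to apply to the imported folder on the emulator.
    """
    sd_archive = f"/sdcard/{archive}"
    return [
        # 1. On the rooted device, pack the app folder onto the blank MicroSD card.
        #    Using 'su -c' to obtain root inside adb shell is an assumption.
        ["adb", "-s", device, "shell",
         f"su -c 'cd /data/data && tar zcvf {sd_archive} {package}'"],
        # 2. Copy the archive from the device to the workstation.
        ["adb", "-s", device, "pull", sd_archive, archive],
        # 3. Copy the archive into /data/data on the emulator.
        ["adb", "-s", emulator, "push", archive, f"/data/data/{archive}"],
        # 4. Unpack it in place so the path structure matches the device.
        ["adb", "-s", emulator, "shell",
         f"cd /data/data && tar zxvf {archive}"],
        # 5. Set the owner and group of the imported folder.
        ["adb", "-s", emulator, "shell",
         f"chown -R {owner}:{owner} /data/data/{package}"],
    ]


if __name__ == "__main__":
    # Hypothetical serial numbers; the owner value follows the paper's example.
    for argv in plan_migration("com.tencent.mm", "u0_a123", "tencent.tar.gz",
                               "DEVICE_SERIAL", "EMULATOR_SERIAL"):
        print(" ".join(argv))
```

Printing the plan before running anything lets the investigator review and record each command, which fits the requirement to document every step of evidence collection.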