Deobfuscating SWF files for fun and for nostalgia

If you were on the internet at any point in the 2000s or 2010s, you will know about Adobe Flash Player (formerly Macromedia Flash Player).

I personally would spend hours upon hours on Kongregate, Newgrounds, and Miniclip playing games like Line Rider, Fancy Pants Adventure, and more.

Adobe Flash brought life to the internet in the form of games, movies and other interactive content that was thought to be impossible to implement with JavaScript. It provided a fully fledged environment for the development of Flash projects that was simple to use even for the less technically minded. As a result, Adobe Flash Player was one of the first things you would download after installing your web browser.

Beyond interactivity, Adobe Flash was also a popular choice because the SWF file produced from a Flash project was much smaller than one could otherwise achieve, with the assets and code inside the file compressed using either ZLIB or LZMA.
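As a rough illustration of that compression, the first eight bytes of an SWF tell you how the rest of the file is packed: a three-byte signature (`FWS` for uncompressed, `CWS` for ZLIB, `ZWS` for LZMA), a version byte, and a little-endian 32-bit length of the uncompressed file. The sketch below is a minimal reader under those assumptions; the helper names `inspect_swf` and `inflate_zlib_swf` are made up for this example, and LZMA files are only identified, not unpacked.

```python
import struct
import zlib
from typing import Tuple

# Signature bytes mapped to the compression they indicate.
COMPRESSION = {b"FWS": "none", b"CWS": "zlib", b"ZWS": "lzma"}


def inspect_swf(data: bytes) -> Tuple[str, int, int]:
    """Return (compression, version, declared uncompressed length)."""
    if len(data) < 8:
        raise ValueError("too short to be an SWF file")
    signature, version, length = struct.unpack("<3sBI", data[:8])
    if signature not in COMPRESSION:
        raise ValueError(f"unknown SWF signature: {signature!r}")
    return COMPRESSION[signature], version, length


def inflate_zlib_swf(data: bytes) -> bytes:
    """Turn a ZLIB-compressed (CWS) file into its uncompressed (FWS) form."""
    compression, _, _ = inspect_swf(data)
    if compression != "zlib":
        raise ValueError("only CWS files are handled in this sketch")
    # Everything after the 8-byte header is a single zlib stream.
    body = zlib.decompress(data[8:])
    # Keep the version and length bytes, swap the signature.
    return b"FWS" + data[3:8] + body
```

Running `inspect_swf(open("Habbo.swf", "rb").read())` on a real file would tell you which of the two schemes it uses before you reach for heavier tooling.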

Another reason for the popularity of Flash was having the ability to obfuscate and site-lock your SWF file using tools such as SecureSWF, deterring people from ripping or copying your content. This was particularly enticing for content creators and game developers as it helped provide an extra layer of security for their works.

Many years later, Flash is a relic of the internet, with support ended by Adobe and browsers no longer allowing it to run. Ruffle is attempting to keep content written for Flash Player alive, with support getting better with every new release.

On the other end of the spectrum, there is also Haxe, which instead provides a language and syntax similar to ActionScript (which is used in Flash files) that can target a multitude of platforms with one codebase. Haxe also has tools to help convert ActionScript codebases to Haxe, which have met with moderate success.

That's enough history, let's move on to the fun stuff.

One of my favourite things to do on the internet back in the late 2000s and early 2010s was hang out on Habbo Hotel. It boasted a lively community and a fun pixel art playground with games, gambling, and other activities.

Snowy Habbo Hotel

The hotel was a fun place to hang out and play games with friends.

Habbo Hotel also used to have an active private server scene with developers working on server implementations for the Habbo Hotel client to point to. The development of these private servers was quite a feat as Habbo had utilised the obfuscation and site-locking techniques above to make modification of their client extremely difficult.

JPEXS obfuscated source

This isn't really workable nor understandable but what's that in the right-most editor?

Fortunately, from 2009 to late 2011 a misconfiguration of sorts meant that the tool that was being used for obfuscation of the client still left behind some valuable metadata that could be used to reverse the obfuscated source code to an almost original state. This went under the radar in the private server community and traditional deobfuscators back then and even today don't appear to use this metadata for any form of deobfuscation.

For those who haven't worked with Flash before, in a Flash project you will typically write the logic for your game or movie using ActionScript. ActionScript is similar to JavaScript in terms of syntax, with the language itself being based on the abandoned ECMAScript 4 specification.

ActionScript is then compiled into bytecode, which is interpreted by the ActionScript Virtual Machine (AVM) embedded in the Adobe Flash Player plugin. Because ActionScript is compiled into bytecode, tools like SecureSWF can then read and modify that bytecode to be unintelligible to humans but still valid for the underlying VM. This modification comes at a slight performance cost but provides a massive net gain in securing private codebases.

We can use disassemblers such as RABCDAsm to disassemble an SWF file into a set of semi-readable assembly files. Where possible, we may also use tools such as JPEXS to convert said assembly into legible ActionScript code.

So when working with the Habbo Hotel client, we can see the code is almost unintelligible, with most identifiers being completely mangled to the point of no longer being valid ActionScript.

The automatic deobfuscation tools provided with JPEXS also don't assist, as they simply rename all these identifiers to things like variable1, variable2, variable3 and so forth. That said, when we look at the PCODE in JPEXS we can catch glimpses of the correct identifiers for the given classes, methods and more.

Armed with this knowledge we can disassemble our SWF file using RABCDAsm and start playing with the provided assembly.
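For reference, the disassembly step can be scripted. The sketch below assumes RABCDAsm's `abcexport` and `rabcdasm` binaries are on your PATH and that the SWF contains two ABC blocks (which matches the `Habbo-0.abc` and `Habbo-1.abc` files in the listing that follows); `disassembly_commands` and `run_all` are illustrative helper names, not part of RABCDAsm.

```python
import subprocess
from pathlib import Path
from typing import List


def disassembly_commands(swf_path: str, abc_count: int) -> List[List[str]]:
    """Build the commands that export and disassemble each ABC block."""
    stem = Path(swf_path).with_suffix("").as_posix()
    # abcexport writes <stem>-0.abc, <stem>-1.abc, ... next to the SWF.
    commands = [["abcexport", swf_path]]
    for index in range(abc_count):
        # rabcdasm unpacks each .abc into a directory of .asasm files.
        commands.append(["rabcdasm", f"{stem}-{index}.abc"])
    return commands


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(disassembly_commands("Habbo.swf", abc_count=2))
```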

→ tree | head -500

├── Habbo-0
│   ├── _-3LN.class.asasm
│   ├── _-3LN.script.asasm
│   ├── com
│   │   └── sulake
│   │       └── core
│   │           └── runtime
│   │               ├── _-0sH.class.asasm
│   │               └── _-0sH.script.asasm
│   ├── Habbo-0.main.asasm
│   ├── Habbo.class.asasm
│   ├── Habbo.script.asasm
│   ├── Logger.class.asasm
│   ├── Logger.script.asasm
├── Habbo-0.abc
├── Habbo-1
│   ├── _-00.class.asasm
│   ├── _-00i.class.asasm
│   ├── _-00i.script.asasm
│   ├── _-00k.class.asasm
│   ├── _-00K_.class.asasm
│   ├── _-00k.script.asasm
│   ├── _-00K_.script.asasm

...[SNIP]...

│   ├── _-zf
│   │   ├── _-1b9.class.asasm
│   │   ├── _-1b9.script.asasm
│   │   ├── _-1Uq.class.asasm
│   │   ├── _-1Uq.script.asasm
│   │   ├── _-Cd.class.asasm
│   │   ├── _-Cd.script.asasm
│   │   ├── _-Oa.class.asasm
│   │   └── _-Oa.script.asasm
│   ├── _-Zh.class.asasm
│   ├── _-Zh.script.asasm
│   ├── _-ZM.class.asasm
│   ├── _-ZM.script.asasm
│   ├── _-Zr.class.asasm
│   ├── _-Zr.script.asasm
│   ├── _-Zu.class.asasm
│   ├── _-Zu.script.asasm
│   ├── _-ZV.class.asasm
│   └── _-ZV.script.asasm
├── Habbo-1.abc
└── Habbo.swf

540 directories, 7522 files

The sheer number of files can be quite intimidating

class
 refid "_-03c:_-1W9"
 instance QName(PackageNamespace("_-03c"), "_-1W9")
  extends QName(PackageNamespace("_-03c"), "_-04U")
  flag SEALED
  flag PROTECTEDNS
  protectedns ProtectedNamespace("_-6L")
  iinit
   name "com.sulake.habbo.ui.widget.messages:RoomWidgetPetCommandMessage/RoomWidgetPetCommandMessage"
   refid "_-03c:_-1W9/instance/init"
   param QName(PackageNamespace(""), "String")
   param QName(PackageNamespace(""), "int")
   param QName(PackageNamespace(""), "String")
   flag HAS_OPTIONAL
   optional Null()
   body
    maxstack 2
    localcount 4
    initscopedepth 5
    maxscopedepth 6
    code
     getlocal0
     pushscope

     getlocal0
     jump                L10

     ; 0xB0
     ; 0xC1
     ; 0x96
     ; 0x57
     ; 0x28
     ; 0x1E
L10:
     getlocal1
     constructsuper      1

     getlocal0
     getlocal2
     initproperty        QName(PrivateNamespace("_-6L"), "_-0VE")

     getlocal0
     getlocal3
     initproperty        QName(PrivateNamespace("_-6L"), "_-3Ao")

     returnvoid
    end ; code
   end ; body
  end ; method
  trait slot QName(PrivateNamespace("_-6L"), "_-0VE") type QName(PackageNamespace(""), "int") value Integer(0) end
  trait slot QName(PrivateNamespace("_-6L"), "_-3Ao") type QName(PackageNamespace(""), "String") end
  trait getter QName(PackageNamespace(""), "_-JP")
   method
    name "com.sulake.habbo.ui.widget.messages:RoomWidgetPetCommandMessage/petId/get"
    refid "_-03c:_-1W9/instance/_-JP/getter"
    returns QName(PackageNamespace(""), "int")
    body
     maxstack 1
     localcount 1
     initscopedepth 5
     maxscopedepth 6
     code
      getlocal0
      pushscope

      getlocal0
      getproperty         QName(PrivateNamespace("_-6L"), "_-0VE")
      returnvalue
     end ; code
    end ; body
   end ; method
  end ; trait
  trait getter QName(PackageNamespace(""), "value")
   method
    name "com.sulake.habbo.ui.widget.messages:RoomWidgetPetCommandMessage/value/get"
    refid "_-03c:_-1W9/instance/value/getter"
    returns QName(PackageNamespace(""), "String")
    body
     maxstack 1
     localcount 1
     initscopedepth 5
     maxscopedepth 6
     code
      getlocal0
      pushscope

      getlocal0
      getproperty         QName(PrivateNamespace("_-6L"), "_-3Ao")
      returnvalue
     end ; code
    end ; body
   end ; method
  end ; trait
 end ; instance
 cinit
  refid "_-03c:_-1W9/class/init"
  body
   maxstack 2
   localcount 1
   initscopedepth 4
   maxscopedepth 5
   code
    getlocal0
    pushscope

    findproperty        QName(PackageNamespace(""), "_-1pG")
    jump                L10

    ; 0xB0
    ; 0xD4
    ; 0xC5
    ; 0x23
    ; 0xA7
    ; 0x2B
L10:
    pushstring          "RWPCM_REQUEST_PET_COMMANDS"
    initproperty        QName(PackageNamespace(""), "_-1pG")

    findproperty        QName(PackageNamespace(""), "_-3K8")
    pushstring          "RWPCM_PET_COMMAND"
    initproperty        QName(PackageNamespace(""), "_-3K8")

    returnvoid
   end ; code
  end ; body
 end ; method
 trait const QName(PackageNamespace(""), "_-1pG") slotid 1 type QName(PackageNamespace(""), "String") value Utf8("RWPCM_REQUEST_PET_COMMANDS") end
 trait const QName(PackageNamespace(""), "_-3K8") slotid 2 type QName(PackageNamespace(""), "String") value Utf8("RWPCM_PET_COMMAND") end
end ; class

We're starting to see some patterns with identifiers here

While poking around, you may notice that the original identifiers are still within the file under what can only be described as some metadata fields within the assembly. It appears as though SecureSWF won't mangle everything by default unless you tell it to.

The fact that these metadata fields are called things like name is humorous and makes our job all the easier.

Using this, we can begin to write a Python script to read these assembly files and build a dictionary of identifiers based on these metadata fields.

from typing import Optional, Tuple


def get_namespace_and_classname(name: str) -> Tuple[str, str]:
    """
    Gets the namespace and classname for the package where available,
    defaulting to an empty string for the namespace.

    The input will typically be the string from the `name` field in the
    "iinit" section of the assembly file.

    :Example:

    >>> get_namespace_and_classname("com.sulake.habbo.ui.widget.furniture.ecotronbox:EcotronBoxFurniWidget/EcotronBoxFurniWidget")
    ('com.sulake.habbo.ui.widget.furniture.ecotronbox', 'EcotronBoxFurniWidget')
    >>> get_namespace_and_classname("Habbo_habboLogoClass/Habbo_habboLogoClass")
    ('', 'Habbo_habboLogoClass')
    """
    # Split on the first ":" only, so a name that carries further colons
    # later on doesn't blow up the unpacking.
    namespace, separator, rest = name.partition(":")

    if not separator:
        # No ":" at all means the class lives in the top-level package.
        namespace, rest = "", name

    # The classname is everything up to the first "/".
    classname = rest.split("/")[0]

    return (namespace, classname)


def get_getter_or_setter_name(name: str) -> Optional[str]:
    """
    Gets the name of a given getter or setter where available,
    returning None if it can't be determined.

    The input will typically be the string from the `name` field in the
    "trait getter" or "trait setter" section of the assembly file.

    :Example:

    >>> get_getter_or_setter_name("_-xV9/get") is None
    True
    >>> get_getter_or_setter_name("com.sulake.habbo.catalog.viewer:ProductContainer/firstProduct/get")
    'firstProduct'
    >>> get_getter_or_setter_name("com.sulake.habbo.catalog.recycler:RecyclerLogic/private:statusActive/get")
    'statusActive'
    >>> get_getter_or_setter_name("com.sulake.habbo.ui.widget.memenu:IWidgetAvatarEffect/com.sulake.habbo.ui.widget.memenu:IWidgetAvatarEffect:isInUse/get")
    'isInUse'
    """
    parts = name.split("/")

    # If there are fewer than three "/"-separated parts, we're dealing
    # with something that we typically can't handle so let's return
    # early.
    if len(parts) < 3:
        return None

    # The final part of the name will be either "get" or "set"
    # with the real method name being just before that.
    accessor_name = parts[-2]

    # Sometimes the package or other garbage will be included
    # in the name section. If it appears we can skip over it
    # by splitting on ":" and taking the last part, which will be
    # the name.
    if ":" in accessor_name:
        accessor_name = accessor_name.split(":")[-1]

    return accessor_name

We can continue iterating on the above, adding more methods for constants, methods, and variables while noting any other patterns we see in the assembly until we have a moderately robust reader and dictionary builder.
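To give a feel for how a reader along these lines might hang together, here is a minimal sketch that relies on one pattern visible in the assembly above: inside a method block, the readable `name` line sits directly before the obfuscated `refid` line. The function names are hypothetical, and treating the obfuscated package as a one-to-one match for the readable package is an assumption rather than something the assembly guarantees.

```python
import re
from typing import Dict, Iterable

NAME_RE = re.compile(r'^\s*name "([^"]*)"')
REFID_RE = re.compile(r'^\s*refid "([^"]*)"')


def pair_names_with_refids(lines: Iterable[str]) -> Dict[str, str]:
    """Map each refid to the `name` string found on the line just before it."""
    pairs: Dict[str, str] = {}
    pending_name = None
    for line in lines:
        name_match = NAME_RE.match(line)
        if name_match:
            pending_name = name_match.group(1)
            continue
        refid_match = REFID_RE.match(line)
        if refid_match and pending_name is not None:
            pairs[refid_match.group(1)] = pending_name
        # Any other line breaks the name/refid adjacency.
        pending_name = None
    return pairs


def class_replacements(pairs: Dict[str, str]) -> Dict[str, str]:
    """Derive class and package replacements from instance initialisers."""
    suffix = "/instance/init"
    replacements: Dict[str, str] = {}
    for refid, name in pairs.items():
        if not refid.endswith(suffix):
            continue
        obf_namespace, _, obf_class = refid[: -len(suffix)].rpartition(":")
        real_namespace, _, real_class = name.split("/")[0].rpartition(":")
        if obf_class.startswith("_-"):
            replacements[obf_class] = real_class
        # Assumption: one obfuscated package maps to one readable package.
        if obf_namespace.startswith("_-") and real_namespace:
            replacements[obf_namespace] = real_namespace
    return replacements
```

Fed the `iinit` section of the class shown earlier, this would map `_-1W9` to `RoomWidgetPetCommandMessage` and `_-03c` to `com.sulake.habbo.ui.widget.messages`. Getters, setters, constants and slots each need their own handling on top of this.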

Notable iterations encountered include:

Once we've built our dictionary of replacements, we can once again iterate over the assembly files and replace the obfuscated _- identifiers with the ones we've found in the dictionary.

import glob
import multiprocessing as mp
import re
from functools import partial
from typing import Dict


def replace_file(path: str, replacements: Dict[str, str]) -> None:
    """
    Given a file path, open the file and replace any identifiers found in
    the dictionary with their deobfuscated counterparts.
    """
    with open(path, "r+") as handle:
        content_lines = handle.readlines()

        new_lines = []

        for index, line in enumerate(content_lines):
            stripped = line.strip()

            # Leave "#include" lines (they point at files on disk),
            # "pushstring" literals and the first line of the file alone.
            if (
                not stripped.startswith("#include")
                and not stripped.startswith("pushstring")
                and index != 0
            ):
                for key, value in replacements.items():
                    if key in line:
                        # Only replace the key when it is a whole identifier,
                        # i.e. bounded by a quote, "/" or ":" (or the start or
                        # end of the line), so we don't accidentally disturb
                        # anything else in the assembly file. The key is
                        # escaped and the value is inserted via a function so
                        # neither is interpreted as regex syntax.
                        line = re.sub(
                            rf"(^|[\"\/:]){re.escape(key)}([\"\/:]|$)",
                            lambda match, value=value: (
                                match.group(1) + value + match.group(2)
                            ),
                            line,
                        )

            new_lines.append(line)

        # Reset the file to the beginning and truncate the content
        handle.seek(0)
        handle.truncate()

        # Then write the new contents to the file
        handle.writelines(new_lines)


def main() -> None:
    """
    When run as a script, gather a list of paths for files we want to
    process and then build the replacement dictionary using our heuristics.

    Once complete, begin replacing the file contents using one worker
    process per CPU core to speed up the process.
    """
    paths = glob.glob("Habbo-*/**/*.class.asasm", recursive=True)

    # build_replacements is the dictionary builder described above; it is
    # not reproduced in full here.
    replacements = build_replacements(paths)

    fn = partial(replace_file, replacements=replacements)

    with mp.Pool(mp.cpu_count()) as pool:
        for count, _ in enumerate(pool.imap_unordered(fn, paths, 25), start=1):
            print(f"Processed {count} files")


if __name__ == "__main__":
    main()
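To see what the boundary check in that substitution buys us, here is a small standalone sketch of the same idea applied to single lines. `replace_identifier` is a helper name invented for this example, and the `_-JP` to `petId` mapping comes from the getter shown in the class listing earlier.

```python
import re


def replace_identifier(line: str, key: str, value: str) -> str:
    """Replace `key` only where it is bounded by a quote, "/" or ":"."""
    pattern = rf'(^|["/:]){re.escape(key)}(["/:]|$)'
    # A function replacement keeps `value` from being read as a regex template.
    return re.sub(pattern, lambda m: m.group(1) + value + m.group(2), line)


# A quoted identifier in a trait declaration is replaced.
print(replace_identifier('trait getter QName(PackageNamespace(""), "_-JP")', "_-JP", "petId"))

# So is the same identifier as a path segment inside a refid.
print(replace_identifier('refid "_-03c:_-1W9/instance/_-JP/getter"', "_-JP", "petId"))

# A key that is only a prefix of a longer identifier is left alone.
print(replace_identifier('refid "_-03c:_-1W9"', "_-1W", "Oops"))
```

Without the boundary characters, a short key such as `_-1W` would also clobber the front of `_-1W9`, which is exactly the kind of silent corruption that makes the reassembled SWF fail in confusing ways.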

Now we can use RABCDAsm once more to assemble the modified files into an SWF, which we will then open in JPEXS to see the updated ActionScript.
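The reassembly step can be scripted in the same spirit. This sketch assumes RABCDAsm's `rabcasm` and `abcreplace` binaries are on your PATH, that each ABC block was disassembled into a `<name>-<index>` directory containing a `<name>-<index>.main.asasm` file (as in the directory listing earlier), and that you are working on a copy of the SWF, since `abcreplace` patches the file in place. `reassembly_commands` is an illustrative helper name.

```python
import subprocess
from pathlib import Path
from typing import List


def reassembly_commands(swf_path: str, abc_count: int) -> List[List[str]]:
    """Build the commands that reassemble each block and patch it into the SWF."""
    stem = Path(swf_path).with_suffix("")
    commands: List[List[str]] = []
    for index in range(abc_count):
        block = f"{stem.name}-{index}"
        main_asasm = stem.parent / block / f"{block}.main.asasm"
        # rabcasm writes the assembled block next to the .main.asasm file.
        main_abc = main_asasm.with_suffix(".abc")
        commands.append(["rabcasm", main_asasm.as_posix()])
        # abcreplace swaps block <index> inside the SWF for the new .abc.
        commands.append(
            ["abcreplace", swf_path, str(index), main_abc.as_posix()]
        )
    return commands


if __name__ == "__main__":
    for command in reassembly_commands("Habbo.swf", abc_count=2):
        subprocess.run(command, check=True)
```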

JPEXS deobfuscated source

And with that we've now deobfuscated an SWF that left behind enough meaningful metadata.

So what about files that don't leave behind enough metadata?

Unfortunately, for files that don't leave behind enough meaningful metadata we can't do much other than attempt to interpret the assembly file or obfuscated ActionScript. We may be able to restore some minor items where metadata was left in QName and Namespace fields, but outside of that there isn't much to be done.

For this reason, things such as the Habbo clients from 2012 onwards are not reversible in this way. In the event that you have a set of reversed files from earlier versions, you may be able to build some form of static analysis tool to partially reverse the file with knowledge of its prior state.
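As one hedged idea for what such a tool could look like (purely an illustration, not something I have run against a later client): string literals pushed with `pushstring` are left readable by the obfuscator in the listing above, so the set of strings a class uses can act as a rough fingerprint. The sketch below matches an unknown class to a known one only when exactly one known class shares its fingerprint. Both function names are hypothetical.

```python
import re
from typing import Dict, FrozenSet, Iterable, List

PUSHSTRING_RE = re.compile(r'^\s*pushstring\s+"(.*)"\s*$')


def string_fingerprint(lines: Iterable[str]) -> FrozenSet[str]:
    """Collect every string literal pushed by a class's assembly."""
    found = set()
    for line in lines:
        match = PUSHSTRING_RE.match(line)
        if match:
            found.add(match.group(1))
    return frozenset(found)


def match_by_fingerprint(
    known: Dict[str, FrozenSet[str]],
    unknown: Dict[str, FrozenSet[str]],
) -> Dict[str, str]:
    """Map obfuscated names to known names where the fingerprint is unique."""
    by_fingerprint: Dict[FrozenSet[str], List[str]] = {}
    for name, fingerprint in known.items():
        if fingerprint:  # Classes without strings can't be told apart.
            by_fingerprint.setdefault(fingerprint, []).append(name)

    matches: Dict[str, str] = {}
    for obfuscated, fingerprint in unknown.items():
        candidates = by_fingerprint.get(fingerprint, [])
        if len(candidates) == 1:
            matches[obfuscated] = candidates[0]
    return matches
```

Exact matching like this would miss any class whose strings changed between versions, so a real tool would likely want fuzzier comparison and extra signals such as method counts or parameter types.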

So what can we now do with the SWF we've deobfuscated?

While Flash might be dead, you can still play Flash content using Ruffle or older builds of Electron with the Pepper Flash plugin. Alternatively, you might attempt to translate the source code into Haxe so you can compile it into a universal application for web, mobile and desktop. You might even use the source to attempt a rewrite as a JavaScript application using PixiJS or similar.

© Lucas Smith.