Parsing the MAME XML

MAME stores information about all of its games in a single XML file, which is an absolute bear to look at. It's very organized, almost to a fault, and it's huge, which makes it unwieldy for text editors. Other applications that might want to use it also choke on its size. Luckily, it's fairly easy to work around.

First thing we need is a converter to get this into a nicer format. JSON is what all the cool kids use, so we'll give that a go.

I'm using an xml2json library found here that suits my needs. All we need is to get the data into JSON, so pick a different tool and run with it if you prefer. With the tool in hand, we convert the XML to JSON.

./ -t xml2json -o ../mame0179.json ../mame0179.xml --strip-text
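If you'd rather not depend on an external tool, the same kind of conversion can be roughed out with Python's standard library. This is only a sketch, not how the xml2json library works internally: `element_to_dict` is a hypothetical helper, the `@` prefix on attributes is chosen to match the `@name` key used later in this post, and the sample XML is a hand-written, made-up stand-in rather than real MAME output.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an Element to a dict, prefixing attribute names with '@'."""
    node = {"@" + key: value for key, value in elem.attrib.items()}
    for child in elem:
        converted = element_to_dict(child)
        if child.tag in node:
            # Repeated tags (e.g. several <rom> elements) are collected in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(converted)
        else:
            node[child.tag] = converted
    text = (elem.text or "").strip()
    if text:
        if node:
            node["#text"] = text
        else:
            # An element with only text collapses to a plain string.
            return text
    return node


# Made-up sample for illustration only; not real MAME data.
sample = """<mame build="0.179">
  <machine name="demogame" sourcefile="demo.cpp">
    <description>Demo Game</description>
    <year>1985</year>
    <rom name="demo.1" size="4096"/>
    <rom name="demo.2" size="4096"/>
  </machine>
</mame>"""

root = ET.fromstring(sample)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=2))
```

Note that with a single `<machine>` element, as in this sample, `machine` comes out as one object rather than a list. The real file has many machines, so there it would be a list.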

Awesome, now we have the MAME database in JSON. So where do we go from here? For my needs, I want them split into individual files. Python to the rescue!

#!/usr/bin/python
import json

# Load the full MAME database produced by the XML-to-JSON conversion.
with open("./mame0179.json") as json_data:
    data = json.load(json_data)

# Write each machine entry to its own file, named after its "@name" attribute.
# The ./json/ directory must already exist.
for machine in data["mame"]["machine"]:
    outputfile = machine["@name"]
    with open('./json/' + outputfile + '.json', 'w') as json_file:
        pretty = json.dumps(machine, sort_keys=True, indent=2)
        json_file.write(pretty)

This loops through the massive JSON file and outputs each ROM set in its own file. From here we can import our JSON into some form of a database, read the files on their own, or pretty much whatever you want at this point.
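As one example of the database route, here is a rough sketch that loads the per-machine files into SQLite. The table layout and the `load_machines` helper are my own assumptions, and it assumes `description` and `year` are plain strings in each file. The demo runs against two made-up entries in a temporary directory; point it at `./json/` for the real thing.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def load_machines(json_dir, conn):
    """Insert one row per machine file; return the number of files loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS machines ("
        "name TEXT PRIMARY KEY, description TEXT, year TEXT, raw TEXT)"
    )
    count = 0
    for path in sorted(Path(json_dir).glob("*.json")):
        with open(path) as json_file:
            machine = json.load(json_file)
        conn.execute(
            "INSERT OR REPLACE INTO machines VALUES (?, ?, ?, ?)",
            (
                machine["@name"],
                machine.get("description"),
                machine.get("year"),
                # Keep the whole entry as JSON text for anything not split out.
                json.dumps(machine, sort_keys=True),
            ),
        )
        count += 1
    conn.commit()
    return count


# Demo with two made-up entries standing in for the real ./json/ directory.
with tempfile.TemporaryDirectory() as tmp:
    for entry in (
        {"@name": "demogame", "description": "Demo Game", "year": "1985"},
        {"@name": "demogame2", "description": "Demo Game II", "year": "1987"},
    ):
        with open(Path(tmp) / (entry["@name"] + ".json"), "w") as json_file:
            json.dump(entry, json_file)

    conn = sqlite3.connect(":memory:")
    loaded = load_machines(tmp, conn)
    rows = conn.execute(
        "SELECT name, year FROM machines ORDER BY name"
    ).fetchall()

print(loaded, rows)
```

Keeping the full entry in a `raw` column means nothing is lost while you decide which fields deserve their own columns.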
