Converting data from MySQL to MongoDB

There is no straight way of converting data from MySQL to MongoDB, but there are a few techniques that you can use.

If your database is simple and you only have individual tables that you want to import (meaning: there are no JOINs between the tables) you can very simply export your data from MySQL to a CSV file and then import that CSV file with mongoimport to MongoDB.

To import your data from MySQL to CSV you could use a query similar to this:

<

p style=”padding-left: 30px;”>SELECT * FROM books INTO OUTFILE '/tmp/books.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'

If your MySQL server is on a remote machine and you want to copy the books.csv file onto your server, have a look at the article that I posted about automating such processes using SCP and expect.

Once you have your CSV file, run this simple command (modify as needed):

mongoimport –db users –collection books –type csv –file /tmp/books.csv

The above works well for single tables or sets of tables with no relationship between them, however, as also the name suggests, relational databases usually have, well, yeah, relations. My advice is the following for converting such tables:

Write a script in your preferred language (PHP, Python, Perl, etc…) that queries these tables using the JOIN statements and returns the dataset in a JSON format, which you can save to disc and later import it using mongoimport. It’s very hard to give an example as the datasets can be diverse but at a conceptual level, see the following:

Let’s assume that I have a Books table with book titles, page numbers, and an Author ID. I also have an Author table with a name, an author ID and a publisher ID and finally, I have a Publisher table with a Publisher ID and a name. The relationship is the following:

One Publisher can have multiple Authors. One Author can only belong to one Publisher. One Author can have multiple books. As per the explanation in the previous few paragraphs of this post our MongoDB could look like this:
Book collection

{
“title”: title,
“pages”: pagenumbers,
“author”: authorID
}
Author collection

{
“name”: author-name,
“ID”: author-id
}
Publisher collection

{
“name”: publisher-name,
“authors”: [{author-id1, author-id2}]
}

Probably it’s even possible to put all this information into one big collection called ‘Books’ as opposed to creating 3 separate ones:

{
“title”: title,
“pages”: pagenumbers,
“author”: [{
“id”: authod-id,
“name”: author-name
}],
“publisher”: [
“name”: publisher-name
]
}

But for the sake of this article, we are assuming that there are 3 collections.

It is then the script’s job to return the right data in the right format from MySQL and save that. Or even, take this to the next level, the very same script could generate a JSON file and immediately call mongoimport. (If you choose this approach, I would recommend that you do a few test runs to make sure that your JSON data structure is indeed correct before importing it into MongoDB.) Here’s an example structure:

$my_file = ‘data.json’;
$con=mysqli_connect(“localhost”,”root”,”rootpwd”,”database”);
if (mysqli_connect_errno()) {
echo “Failed to connect to MySQL: ” . mysqli_connect_error();
}

$result = mysqli_query($con,”SELECT * FROM Books b JOIN Authors a ON b.author_id = a.id”);

$rows = mysqli_fetch_array($result);
print json_encode($rows);

mysqli_close($con);

$handle = fopen($my_file, ‘w’) or die(‘Cannot open file: ‘.$my_file);
fwrite($handle, $rows);

The example above should only give you an idea how this could be achieved. You probably need to massage your JSON data further.

As a summary – if you are converting your database make sure that you do proper de-normalisation, you can probably store 2 or 3 different table’s information in one collection without a problem in a form of an array or a nested object. If you are looking for automated conversions you’d have to code that on your own and format the CSV or JSON file to suit your needs.


Posted

in

,

by

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *