Getting Started with MongoDB and Big Data

MongoDB is a document-oriented NoSQL database that is often used for "big data" workloads. The Community edition is free to download, so it's a popular choice among small business owners looking to scale their current database solution. For developers, knowing MongoDB is a useful skill in the job market. For business owners, MongoDB offers benefits for reporting, data analysis, and storage.

How Does NoSQL Differ from Traditional SQL?

SQL has been around for decades. Three of the most widely used SQL databases are MySQL, Microsoft SQL Server, and Oracle. These are also called relational databases. A relational database stores data in tables and uses primary and foreign keys to link those tables together. If a row violates one of these relationships, the database returns an error and rejects the write, so that data is not stored.

NoSQL "big data" databases store data very differently from traditional SQL databases. They don't enforce the same constraints as a SQL database, which makes them useful when you know you need to store data but don't yet know how it will be structured.

Databases such as MongoDB store data as documents. Think of a document as a page. When you type a document, you organize the content in your own way, and each page can have different headers and paragraphs. You don't know what you'll store in your document until you write it. MongoDB gives you the flexibility to store almost any shape of data on the fly. For instance, a relational schema typically requires every order to be linked to a customer. If the order has no customer relationship, the insert fails. With MongoDB, you can store an order with no relationship to any other data.
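To make the contrast concrete, here is a minimal sketch in plain JavaScript. It is not MongoDB or SQL code: the helper names insertRelationalOrder and insertDocument, the arrays, and the sample orders are all hypothetical stand-ins that only imitate the two behaviors described above.

```javascript
// Hypothetical in-memory stand-ins; this is NOT MongoDB or SQL code.
const customers = [{ id: 1, name: "Joe Smith" }];
const relationalOrders = []; // behaves like a table with a foreign key
const documentOrders = [];   // behaves like a schemaless collection

// Relational-style insert: reject the order if its customer does not exist.
function insertRelationalOrder(order) {
  const customerExists = customers.some((c) => c.id === order.customerId);
  if (!customerExists) {
    throw new Error("foreign key violation: unknown customer " + order.customerId);
  }
  relationalOrders.push(order);
}

// Document-style insert: store whatever shape arrives.
function insertDocument(doc) {
  documentOrders.push(doc);
}

insertRelationalOrder({ orderId: 100, customerId: 1, total: 25 });

let rejected = false;
try {
  insertRelationalOrder({ orderId: 101, customerId: 99, total: 40 });
} catch (err) {
  rejected = true; // the write is refused; existing rows are untouched
}

insertDocument({ orderId: 101, total: 40 });                   // no customer at all
insertDocument({ orderId: 102, items: ["book"], gift: true }); // different fields

console.log(rejected, relationalOrders.length, documentOrders.length);
// prints: true 1 2
```

The relational-style helper refuses the order that points at a missing customer, while the document-style helper accepts both orders even though they share almost no fields.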

What Are the Pros and Cons of Big Data Databases?

The term "big data" is a popular buzzword in software design circles, but its advantages can be difficult to understand unless you have a technical background. As a business owner, you want to know the benefits in practical terms. What can it do to improve profits? What can it do to help the business grow?

NoSQL databases are generally easier to scale out across many servers than relational databases, but not every business needs that much storage capacity or analysis power. MongoDB offers flexible querying over loosely structured data, but if you run a small blog, you probably don't need this type of capacity. NoSQL can also slow your system if it's not implemented properly, so make sure you hire someone who understands how to set up the environment.

Ecommerce stores are probably among the best candidates for a MongoDB implementation. The data collected in MongoDB can feed reports and analysis that help predict customer behavior. Companies such as Google use big data to identify what customers are looking for and which ads to display in search results for revenue. Reporting and projection analysis are probably the biggest benefits of MongoDB for a business.

What Do You Need for a Big Data Environment?

When you work with big data, the idea is to store "everything." With older SQL databases, you only stored specific data that followed your business rules. With big data, you capture everything, whether or not the data has been scrubbed or formatted. This allows you to use any input from users, search engines, or customers. However, you need more storage capacity to hold the massive amount of information. If you're looking into a big data solution, make sure you have the storage capacity, whether it's in the cloud or on local storage devices.
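A rough back-of-the-envelope estimate can help with capacity planning. The sketch below is plain JavaScript. The function name estimateStorageGiB and every input figure are made-up assumptions for illustration, and the calculation ignores indexes, compression, and replication, all of which change the real number.

```javascript
// Hypothetical sizing helper; every number below is an illustrative assumption.
function estimateStorageGiB(docsPerDay, avgDocBytes, retentionDays) {
  const totalBytes = docsPerDay * avgDocBytes * retentionDays;
  return totalBytes / (1024 ** 3); // convert bytes to GiB
}

// Example: 1,000,000 documents a day, 1 KiB each, kept for 365 days.
const estimate = estimateStorageGiB(1000000, 1024, 365);
console.log(estimate.toFixed(1) + " GiB of raw data");
// prints: 348.1 GiB of raw data
```

Swapping in your own traffic and document-size figures gives a first approximation of how much raw storage "capture everything" would need.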

MongoDB runs on the major operating systems, so you usually don't need to worry about your server platform. It is available for Linux, macOS (formerly OS X), and Windows. Most programmers work with MongoDB in a Linux environment, but more Windows developers are turning to MongoDB for big data backend solutions. Drivers are available for most popular programming languages, so the platform you use to build your websites or applications is rarely an obstacle.

Once you have MongoDB set up, you still need reporting and management tools. MongoDB supports command-line operations, but some developers have created GUI tools that look similar to familiar SQL interfaces. One such tool is MongoVUE, a Windows desktop application. Its interface is similar to SQL Server Management Studio, and you can add, delete, and edit records without writing queries by hand.

For the Coders or First-Time MongoDB Users

Most new coders know at least a little SQL, but moving to a NoSQL environment means learning a different query syntax. The MongoDB shell uses JavaScript, so JS programmers will probably find it easier to pick up than someone with no SQL or JS experience.

First, you should understand that MongoDB stores documents in collections. Collections contain the document structures discussed earlier. You can think of collections in a similar way to tables, although the structure is different. Collections are queried for documents just like a table is queried for its records. Let's assume you have a collection called customer. You can insert customer documents into the customer collection using the following code:

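// A document is written as a JavaScript object literal in braces.
var customerDoc = { name: "Joe Smith" };
db.customer.insertOne(customerDoc); // older shells use db.customer.insert()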

The above code inserts a document with a "name" property. The "name" property is given the value "Joe Smith". What's beneficial about MongoDB and big data is that your next record can contain any type of data, even if it's completely different from the previous record. With a relational database, every row in a table must follow the same column structure. Let's insert another record. Take a look at the following code:

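// Same collection, different field name; no schema change is needed.
var customerDoc = { first_name: "Joe" };
db.customer.insertOne(customerDoc); // older shells use db.customer.insert()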

In the above code, a second document is saved, but this time the field is named "first_name." With a relational database, this insert would fail unless the table already had a first_name column. With MongoDB, the record is stored in the customer collection regardless of the field name differences. MongoDB automatically creates an ID for each record you insert and stores it in a field named _id. By default the ID is an ObjectId, a unique value that the shell displays as a 24-character hexadecimal string, and you can use it in queries to return one specific record. For instance, if you want to look up a specific customer through code, you can use the unique ID to tell that customer apart from every other record.
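The ID behavior can be sketched without a running server. The block below is plain JavaScript with a hypothetical in-memory stand-in: makeCollection and its insertOne and findOne methods only imitate the idea of assigning an _id and looking a document up by it. They are not the MongoDB driver, and the counter-based ID is a simplification of a real ObjectId.

```javascript
// Hypothetical in-memory stand-in for a collection; NOT the MongoDB driver.
function makeCollection() {
  const docs = [];
  let counter = 1;
  return {
    // Assign an _id when the caller did not supply one, as MongoDB does.
    insertOne(doc) {
      const _id = doc._id !== undefined
        ? doc._id
        : (counter++).toString(16).padStart(24, "0"); // fake 24-character hex ID
      docs.push({ ...doc, _id });
      return { insertedId: _id };
    },
    // Return the first document whose fields all match the query, or null.
    findOne(query) {
      const match = docs.find((d) =>
        Object.keys(query).every((key) => d[key] === query[key]));
      return match || null;
    },
  };
}

const customer = makeCollection();
const first = customer.insertOne({ name: "Joe Smith" });
const second = customer.insertOne({ first_name: "Joe" });

// Look up one specific customer by its generated ID.
const found = customer.findOne({ _id: second.insertedId });
console.log(found);
```

Even though the two documents use different field names, each one gets its own _id, and that value is enough to retrieve exactly one of them.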

One final bit of code that's useful to know is the "find" function. The "find" function returns the documents in a collection. Since you've inserted two documents, you can now view those two documents to verify that they've been inserted. The following code shows you how to view the two documents in the customer collection:

db.customer.find()

In the MongoDB shell, the above statement prints the results directly. If you're working in a script, assign the result to a variable instead. The shell uses JavaScript, and find() returns a cursor that you can iterate over. The following code assigns that cursor to a variable:

var data = db.customer.find()
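In the real shell, the value returned by find() is a cursor, and methods such as toArray() and forEach() are used to read the documents from it. The sketch below shows that pattern in plain JavaScript. The makeCollection helper and the db object here are hypothetical in-memory stand-ins so the snippet runs without a server; they only mimic the shape of the call.

```javascript
// Hypothetical in-memory stand-in; it only mimics the shape of find().
function makeCollection(initialDocs) {
  const docs = initialDocs.slice();
  return {
    find(query = {}) {
      const matches = docs.filter((d) =>
        Object.keys(query).every((key) => d[key] === query[key]));
      // Return a cursor-like object rather than the array itself.
      return {
        toArray: () => matches.slice(),
        forEach: (fn) => matches.forEach((doc) => fn(doc)),
      };
    },
  };
}

// Stand-in for the shell's db object, preloaded with the two documents.
const db = {
  customer: makeCollection([
    { _id: 1, name: "Joe Smith" },
    { _id: 2, first_name: "Joe" },
  ]),
};

const data = db.customer.find();  // cursor-like object
const allDocs = data.toArray();   // plain array of both documents
const onlyJoeSmith = db.customer.find({ name: "Joe Smith" }).toArray();

data.forEach((doc) => console.log(doc));
```

Passing a query object such as { name: "Joe Smith" } narrows the result to matching documents, while an empty call returns everything in the collection.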

There is a lot more to MongoDB's query syntax than what has been covered here, but these statements give you a basic start. Learning a new query language isn't an easy task, but you can master it with practice and patience.

You can use MongoDB in addition to traditional SQL databases. SQL databases still have benefits. They are good for structured data and still power much of big business storage and reporting. Big data is best used alongside these databases when you already have established SQL servers. You can transfer data to MongoDB and store massive amounts of data that you couldn't effectively manage with traditional databases. The MongoDB Community edition is free, so experimenting with NoSQL costs nothing beyond server and storage resources.
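As one way to picture that transfer, the sketch below reshapes rows from two relational tables into customer documents with their orders embedded. It is plain JavaScript under stated assumptions: the table layouts, the sample rows, and the rowsToDocuments helper are all hypothetical, and embedding orders inside customers is only one possible document design.

```javascript
// Hypothetical sample rows, as they might come back from two SQL tables.
const customerRows = [
  { id: 1, name: "Joe Smith" },
  { id: 2, name: "Ann Lee" },
];
const orderRows = [
  { id: 100, customer_id: 1, total: 25 },
  { id: 101, customer_id: 1, total: 40 },
  { id: 102, customer_id: 2, total: 15 },
];

// Build one document per customer with that customer's orders embedded.
function rowsToDocuments(customers, orders) {
  return customers.map((c) => ({
    _id: c.id,
    name: c.name,
    orders: orders
      .filter((o) => o.customer_id === c.id)
      .map((o) => ({ orderId: o.id, total: o.total })),
  }));
}

const documents = rowsToDocuments(customerRows, orderRows);
console.log(JSON.stringify(documents, null, 2));
```

The foreign key disappears in the result because each order now lives inside the customer it belongs to. An array like this could then be loaded into a collection in a single bulk insert.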
