Building Your First Data Science Application in MongoDB

June 23, 2017

Speaker: Robyn Allen, Software Engineer, Central Inventions 
Level: 100 (Beginner)
Track: Tutorials

To provide a hands-on opportunity to work with real data, this session centers on a web-hosted quiz application that helps students practice math and memorize vocabulary. After experimenting with a small demonstration dataset (which each attendee generates during the workshop), attendees will be guided through working with an anonymized dataset in MongoDB. No prior MongoDB experience is required, but attendees are expected to download and install MongoDB Community Edition (available for free from mongodb.com) and to have a working Python 3 environment of their choice (e.g., IDLE, free from python.org) installed on the laptop they bring to the workshop.

Prerequisites:
Attendees are expected to bring a laptop with the following software installed:


  • MongoDB 3.4.x Community Edition
  • The text editor or IDE of their choice
  • A working Python 3 environment of their choice


No prior MongoDB experience is required.

What You Will Learn:

How to load a CSV file into MongoDB using mongoimport and then write queries (using the mongo shell) to confirm that the data appears as expected. Attendees will use a demo version of an online quiz app to generate a small data file of raw session data (which can be accessed via http://strawnoodle.com/api/testdata after logging in to the demo app and answering one or more quiz questions about MongoDB). After studying how the demo app stores session data, attendees will practice using mongoimport to import anonymized session data (provided during the workshop) into MongoDB.
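As a rough illustration of the import step, the sketch below assembles a mongoimport command line from Python. The database name (quizdemo), collection name (sessions), and CSV file name are assumptions made for this example and are not taken from the workshop materials; the flags shown (--db, --collection, --type, --headerline, --file) are standard mongoimport options.

```python
import subprocess


def build_mongoimport_command(csv_path, db="quizdemo", collection="sessions"):
    """Return the argument list for importing a CSV file with mongoimport.

    The default database and collection names are hypothetical placeholders.
    """
    return [
        "mongoimport",
        "--db", db,
        "--collection", collection,
        "--type", "csv",
        "--headerline",  # treat the first CSV row as the field names
        "--file", csv_path,
    ]


def run_import(csv_path):
    """Run mongoimport against a local mongod; raises if the import fails."""
    subprocess.run(build_mongoimport_command(csv_path), check=True)
```

Calling run_import("sessions.csv") requires mongoimport on the PATH and a running mongod. Afterwards, shell queries such as db.sessions.count() and db.sessions.findOne() are one way to check that the documents look as expected.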

How to use the aggregation pipeline (in PyMongo) to implement more complicated queries and gain insights from data. Because the sample dataset contains data from users of different skill levels, queries can be designed to reveal summary statistics for the anonymous user cohort or the performance of individual users. Participants will receive instruction in using MongoDB aggregation pipelines to write powerful, efficient queries with very few lines of code.
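To make the idea of per-user summary statistics concrete, here is a minimal PyMongo sketch. The field names user_id and correct (assumed to be stored as 1 or 0), along with the quizdemo database and sessions collection, are hypothetical; the actual schema of the workshop dataset is not described here.

```python
def accuracy_by_user_pipeline(min_attempts=1):
    """Build a pipeline that summarizes attempts and accuracy per user.

    Assumes each document has a `user_id` field and a numeric `correct`
    field (1 for a right answer, 0 for a wrong one); both are hypothetical.
    """
    return [
        # One output document per user, with a count and a mean score.
        {"$group": {
            "_id": "$user_id",
            "attempts": {"$sum": 1},
            "accuracy": {"$avg": "$correct"},
        }},
        # Drop users with too few answers to be meaningful.
        {"$match": {"attempts": {"$gte": min_attempts}}},
        # Most accurate users first.
        {"$sort": {"accuracy": -1}},
    ]


def accuracy_by_user(uri="mongodb://localhost:27017", min_attempts=1):
    """Run the pipeline against a local MongoDB instance."""
    # Imported here so the pipeline builder works without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    sessions = client["quizdemo"]["sessions"]  # hypothetical names
    return list(sessions.aggregate(accuracy_by_user_pipeline(min_attempts)))
```

Keeping the pipeline as a plain list of dictionaries makes it easy to print, tweak one stage at a time, or paste into the shell's db.sessions.aggregate(...) for comparison.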

How to write queries to analyze sample data from an online quiz app. Once the sample data has been loaded into MongoDB, participants will be guided in writing basic queries to examine it. Participants will have an opportunity to write queries in the mongo shell and in Python to familiarize themselves with syntax variations and key ideas. Participants will also learn how to implement CRUD (create, read, update, delete) operations in PyMongo.
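The four CRUD operations map onto PyMongo collection methods roughly as in the sketch below. The document fields (user_id, question, correct) are invented for illustration and may not match the quiz app's real session format.

```python
def crud_round_trip(collection):
    """Exercise create, read, update, and delete on one demo document.

    `collection` is expected to be a PyMongo Collection; the document
    fields used here are hypothetical.
    """
    selector = {"user_id": "demo-user"}

    # Create: insert a single document.
    collection.insert_one(
        {"user_id": "demo-user", "question": "2 + 2", "correct": 0}
    )

    # Read: fetch the document that was just inserted.
    before = collection.find_one(selector)

    # Update: mark the answer as correct using the $set operator.
    collection.update_one(selector, {"$set": {"correct": 1}})
    after = collection.find_one(selector)

    # Delete: remove the demo document again.
    collection.delete_one(selector)

    return before, after
```

With a local server running, this could be called as crud_round_trip(MongoClient()["quizdemo"]["sessions"]) after importing MongoClient from pymongo; the database and collection names are again placeholders.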
