Aggregation Aggregations operations process data records and return computed results. Aggregation operations group values from multiple documents together,


2 Aggregation Aggregation operations process data records and return computed results. They group values from multiple documents together and can perform a variety of operations on the grouped data to return a single result. Like queries, aggregation operations in MongoDB take collections of documents as input and return results in the form of one or more documents. MongoDB provides three ways to perform aggregation:
1. Aggregation pipeline
2. Map-reduce function
3. Single purpose aggregation methods and commands

3 1.Aggregation Pipeline The aggregation pipeline is a framework for performing aggregation tasks, modeled on the concept of data processing pipelines. Pipeline stages provide filters that operate like queries and document transformations that modify the form of the output document. The pipeline transforms the documents into aggregated results, and is accessed through the aggregate database command.

4 1.Aggregation Pipeline Contd…
The MongoDB aggregation pipeline starts with the documents of a collection and streams them from one pipeline operator to the next. Each operator in the pipeline transforms the documents as they pass through. The db.collection.aggregate() method returns a cursor and can return result sets of any size. The method takes the pipeline operators and their pipeline expressions as its parameter.

5 1.Aggregation Pipeline Contd…
Pipeline Operators: Pipeline operators appear in an array. Documents pass through the operators in sequence.
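The idea of documents streaming through an array of operators can be imitated in plain JavaScript. This is only a sketch of the concept, not how MongoDB implements it: the orders data, the field names, and the helper names matchStage, groupStage, and runPipeline are all hypothetical.

```javascript
// Hypothetical sample documents; field names are illustrative only.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Stage 1: keep only matching documents (the role $match plays).
const matchStage = (docs) => docs.filter((d) => d.status === "A");

// Stage 2: group by cust_id and sum amount (the role $group with $sum plays).
const groupStage = (docs) => {
  const totals = new Map();
  for (const d of docs) {
    totals.set(d.cust_id, (totals.get(d.cust_id) || 0) + d.amount);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
};

// The pipeline is an array of stages; each stage receives the previous output.
const runPipeline = (docs, stages) =>
  stages.reduce((current, stage) => stage(current), docs);

const pipelineResult = runPipeline(orders, [matchStage, groupStage]);
console.log(pipelineResult);

// Roughly the same request in the mongo shell, assuming an "orders" collection:
// db.orders.aggregate([
//   { $match: { status: "A" } },
//   { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
// ])
```

Swapping the order of the two stages would change the result, which is why the operators are given as an ordered array.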

6 1.Aggregation Pipeline Contd…
Pipeline Expressions: Each pipeline operator takes a pipeline expression as its operand. Pipeline expressions specify the criteria to apply to the input documents. Expressions have a document structure and can contain fields, values, and operators. Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other documents.

7 1.Aggregation Pipeline Contd…
Expression Operators

8 1.Aggregation Pipeline Contd…

9 2.Map Reduce Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For map-reduce operations, MongoDB provides the mapReduce database command. Map-reduce operations take the documents of a single collection as the input and can perform any arbitrary sorting and limiting before beginning the map stage.

10 2.Map Reduce contd… In a map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further condense or process the results of the aggregation.
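The map, reduce, and finalize phases described above can be walked through in plain JavaScript. The sketch below is an imitation under assumed data, not MongoDB's implementation: the sales documents and the names emit, mapFn, reduceFn, and finalizeFn are hypothetical.

```javascript
// Hypothetical input documents that already passed the query condition.
const sales = [
  { cust_id: "A", amount: 10 },
  { cust_id: "B", amount: 5 },
  { cust_id: "A", amount: 7 },
];

// Map phase: every document emits one key-value pair.
const emitted = new Map();
const emit = (key, value) => {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
};
const mapFn = (doc) => emit(doc.cust_id, doc.amount);
sales.forEach(mapFn);

// Reduce phase: condenses the values collected for one key.
const reduceFn = (key, values) => values.reduce((sum, v) => sum + v, 0);

// Optional finalize phase: post-processes each reduced value.
const finalizeFn = (key, reduced) => ({ total: reduced });

// Reduce runs only for keys with multiple values; single values pass through.
const mapReduceOutput = [...emitted].map(([key, values]) => {
  const reduced = values.length > 1 ? reduceFn(key, values) : values[0];
  return { _id: key, value: finalizeFn(key, reduced) };
});
console.log(mapReduceOutput);
```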

11 2.Map Reduce contd…

12 2.Map Reduce contd… Map Reduce Concurrency
A map-reduce operation is composed of many tasks, including reads from the input collection, executions of the map function, executions of the reduce function, writes to a temporary collection during processing, and writes to the output collection. A map-reduce operation can be applied multiple times to documents.

13 3.Single Purpose Aggregation Operations
For a number of common single purpose aggregation operations, MongoDB provides special purpose database commands. These common aggregation operations are:
1. Distinct
2. Count
3. Grouping

14 3.Single Purpose Aggregation Operations

15 3.Single Purpose Aggregation Operations
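Count The count command as well as the count() and cursor.count() methods provide access to counts in the mongo shell.
Collection records:
{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
db.records.count() counts all documents in the collection and returns 4.
db.records.count( { a: 1 } ) counts only the documents where a is 1 and returns 3.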
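To check the two counts by hand, the four sample records can be counted in plain JavaScript. The helper countDocs is hypothetical and handles only simple equality conditions, which is all this example needs.

```javascript
// The four sample records from the slide.
const records = [
  { a: 1, b: 0 },
  { a: 1, b: 1 },
  { a: 1, b: 4 },
  { a: 2, b: 2 },
];

// A document matches when every queried field equals the given value.
// An empty query matches every document.
const countDocs = (docs, query = {}) =>
  docs.filter((d) => Object.entries(query).every(([k, v]) => d[k] === v)).length;

const totalCount = countDocs(records);           // like db.records.count()
const countWhereA1 = countDocs(records, { a: 1 }); // like db.records.count( { a: 1 } )
console.log(totalCount, countWhereA1);
```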

16 3.Single Purpose Aggregation Operations
Group The group operation takes a number of documents that match a query, and then collects groups of documents based on the value of a field or fields. It returns an array of documents with computed results for each group of documents. Access the grouping functionality via the group command or the db.collection.group() method in the mongo shell.

17 3.Single Purpose Aggregation Operations
Example of Group
Collection records:
{ a: 1, count: 4 }
{ a: 1, count: 2 }
{ a: 2, count: 3 }
{ a: 2, count: 1 }
{ a: 1, count: 5 }
{ a: 4, count: 4 }
db.records.group( { key: { a: 1 }, reduce: function(cur, result) { result.count += cur.count }, initial: { count: 0 } } )
The results of this group operation would resemble the following:
[ { a: 1, count: 11 }, { a: 2, count: 4 }, { a: 4, count: 4 } ]
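Running the reduce function by hand over the six sample records gives one result document per distinct value of a. The sketch below reuses the reduce function and the initial value from the example; the surrounding loop and the names groupRecords and groups are assumptions that stand in for what the group operation does internally.

```javascript
// The six sample records from the slide.
const groupRecords = [
  { a: 1, count: 4 },
  { a: 1, count: 2 },
  { a: 2, count: 3 },
  { a: 2, count: 1 },
  { a: 1, count: 5 },
  { a: 4, count: 4 },
];

// Same reduce function as in the group call: adds the current count to the result.
const reduceGroup = (cur, result) => { result.count += cur.count; };

// One accumulator per distinct key value, each starting from initial: { count: 0 }.
const groups = new Map();
for (const doc of groupRecords) {
  if (!groups.has(doc.a)) groups.set(doc.a, { a: doc.a, count: 0 });
  reduceGroup(doc, groups.get(doc.a));
}

const groupResult = [...groups.values()];
console.log(groupResult);
// [ { a: 1, count: 11 }, { a: 2, count: 4 }, { a: 4, count: 4 } ]
```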

18 Indexing Indexes provide high performance read operations for frequently used queries. Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must scan every document in a collection to select those documents that match the query statement. MongoDB defines indexes at the collection level and supports indexes on any field or sub-field of the documents in a MongoDB collection.
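The difference between scanning every document and consulting an index can be sketched in plain JavaScript. A Map stands in for the index here purely for illustration; it is not the data structure MongoDB uses, and the users data and variable names are hypothetical.

```javascript
// Hypothetical collection contents.
const users = [
  { _id: 1, name: "Ann", zipcode: "53511" },
  { _id: 2, name: "Bob", zipcode: "10001" },
  { _id: 3, name: "Cy", zipcode: "53511" },
];

// Without an index: every document is examined to answer the query.
let scanned = 0;
const scanResult = users.filter((u) => {
  scanned += 1;
  return u.zipcode === "53511";
});

// With an index: build a lookup from field value to document positions once.
const zipIndex = new Map();
users.forEach((u, pos) => {
  if (!zipIndex.has(u.zipcode)) zipIndex.set(u.zipcode, []);
  zipIndex.get(u.zipcode).push(pos);
});

// The query then touches only the matching documents.
const indexedResult = (zipIndex.get("53511") || []).map((pos) => users[pos]);
console.log(scanned, indexedResult.length);
```

Both approaches return the same documents; the index only changes how many documents have to be examined to find them.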

19

20 Types of Indexes
1. Default _id
2. Single Field Indexes
3. Compound Indexes
4. Multikey Indexes
5. Text Indexes
6. Geospatial Indexes
7. Hashed Indexes

21 1. Default _id All MongoDB collections have an index on the _id field that exists by default. If applications do not specify a value for _id the driver or the mongod will create an _id field with an ObjectId value. The _id index is unique, and prevents clients from inserting two documents with the same value for the _id field.

22 2. Single Field Indexes
A single field index only includes data from a single field of the documents in a collection. Example:
Collection: user
{ "_id": ObjectId(...), "name": "John Doe", "address": { "street": "Main", "zipcode": "53511" } }
Index on Single Field:
db.user.ensureIndex( { name: 1 } ) … Ascending
db.user.ensureIndex( { name: -1 } ) … Descending
Index on Embedded Field (Sub-Document):
db.user.ensureIndex( { "address.zipcode": 1 } )

23 3. Compound Indexes
A compound index includes more than one field of the documents in a collection. The order of the fields in a compound index is very important: the index sorts entries by the first field and then, within each value of that field, by the next. Example:
{ "_id": ObjectId(...), "item": "Banana", "category": ["food", "produce", "grocery"], "location": "4th Street Store", "stock": 4, "type": "cases" }
Compound Index on item and stock is:
db.products.ensureIndex( { "item": 1, "stock": -1 } )
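The effect of the field order in { "item": 1, "stock": -1 } can be seen by sorting a few documents the same way: ascending by item first, then descending by stock within each item. The sample products and the comparator name are hypothetical, and the sort only imitates the ordering of index entries.

```javascript
// Hypothetical documents with the two indexed fields.
const products = [
  { item: "Banana", stock: 4 },
  { item: "Apple", stock: 2 },
  { item: "Banana", stock: 9 },
  { item: "Apple", stock: 7 },
];

// item ascending (1) first; stock descending (-1) only breaks ties on item.
const byItemAscStockDesc = (x, y) => {
  if (x.item < y.item) return -1;
  if (x.item > y.item) return 1;
  return y.stock - x.stock;
};

// Copy before sorting so the original array is left untouched.
const indexOrder = [...products].sort(byItemAscStockDesc);
console.log(indexOrder);
// [ Apple 7, Apple 2, Banana 9, Banana 4 ]
```

Reversing the field order would sort by stock first and would no longer keep all entries for one item next to each other.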

24 4. Multikey Indexes
MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on an element or elements of the arrays. MongoDB automatically determines whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type.
db.user.ensureIndex( { "subject.dmsa": 1 } )
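The rule of one index entry per array element can be illustrated in plain JavaScript. This is a conceptual sketch only: the documents, the category field, and the helper idsFor are hypothetical.

```javascript
// Hypothetical documents; category is an array in two of them.
const catalog = [
  { _id: 1, category: ["food", "produce"] },
  { _id: 2, category: ["food", "grocery"] },
  { _id: 3, category: "toys" },
];

// Build index entries: an array value contributes one entry per element,
// a non-array value contributes a single entry.
const multikeyEntries = [];
for (const doc of catalog) {
  const values = Array.isArray(doc.category) ? doc.category : [doc.category];
  for (const value of values) {
    multikeyEntries.push({ key: value, _id: doc._id });
  }
}

// Matching on a single element finds every document whose array contains it.
const idsFor = (key) =>
  multikeyEntries.filter((e) => e.key === key).map((e) => e._id);

console.log(multikeyEntries.length, idsFor("food"));
```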

25 5. Text Indexes
Text indexes support search of string content in documents. Text indexes can include any field whose value is a string or an array of string elements. To perform queries that access the text index, use the $text query operator. To create a text index, use the db.collection.ensureIndex() method. To index a field that contains a string or an array of string elements, include the field and specify the string literal "text" in the index document.
db.reviews.ensureIndex( { comments: "text" } )

26 5. Text Indexes
Text Index on Specific Fields
Creates a text index on the fields subject and content:
db.collection.ensureIndex( { subject: "text", content: "text" } )
Text Index on All Fields
To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain string content. Example: index any string value in the data of every field of every document in the collection and name the index TextIndex:
db.collection.ensureIndex( { "$**": "text" }, { name: "TextIndex" } )

27 Index Properties TTL (Time To Live) Indexes:
TTL indexes are special indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time; collections that use them are called TTL collections. This is ideal for some types of information, such as machine generated event data, logs, and session information, that only need to persist in a database for a limited amount of time.
db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
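The expiry rule behind expireAfterSeconds can be sketched as a simple date comparison: a document becomes eligible for removal once its createdAt value is at least 3600 seconds in the past. The sample events, the fixed "now" timestamp, and the helper isExpired are hypothetical, and the sketch ignores when MongoDB actually performs the deletion.

```javascript
// Same value as the expireAfterSeconds option in the createIndex call.
const expireAfterSeconds = 3600;

// Hypothetical log events with the indexed date field.
const logEvents = [
  { _id: 1, createdAt: new Date("2024-01-01T10:30:00Z") },
  { _id: 2, createdAt: new Date("2024-01-01T11:30:00Z") },
];

// A document is expired once its age reaches expireAfterSeconds.
const isExpired = (doc, now) =>
  now.getTime() - doc.createdAt.getTime() >= expireAfterSeconds * 1000;

// A fixed "current time" keeps the example deterministic.
const now = new Date("2024-01-01T12:00:00Z");
const remainingEvents = logEvents.filter((doc) => !isExpired(doc, now));
console.log(remainingEvents.map((doc) => doc._id));
```

At the chosen time the first event is 90 minutes old and expires, while the second is 30 minutes old and stays.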

28 Index Properties Unique Indexes:
A unique index causes MongoDB to reject all documents that contain a duplicate value for the indexed field. To create a unique index, use the db.collection.ensureIndex() method with the unique option set to true. By default, unique is false on MongoDB indexes.
db.members.ensureIndex( { "user_id": 1 }, { unique: true } )
Drop Duplicates
Force MongoDB to create a unique index by deleting documents with duplicate values when building the index.
db.collection.ensureIndex( { a: 1 }, { unique: true, dropDups: true } )
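How a unique index rejects a duplicate value can be imitated with a set of keys that have already been seen. The class UniqueIndex and the sample members are hypothetical; the sketch only shows the rejection rule, not MongoDB's error format or storage.

```javascript
// A minimal stand-in for a unique index on one field.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  // Throws if the document's value for the indexed field is already present.
  add(doc) {
    const key = doc[this.field];
    if (this.keys.has(key)) {
      throw new Error(`duplicate key: ${key}`);
    }
    this.keys.add(key);
  }
}

const userIdIndex = new UniqueIndex("user_id");
userIdIndex.add({ user_id: 1, name: "Ann" });

// A second document with the same user_id is rejected.
let duplicateRejected = false;
try {
  userIdIndex.add({ user_id: 1, name: "Bob" });
} catch (err) {
  duplicateRejected = true;
}
console.log(duplicateRejected); // true
```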

29 Index Properties Sparse Indexes:
A sparse index does not index documents that do not have the indexed field. That is, sparse indexes only contain entries for documents that have the indexed field, even if the indexed field contains a null value. The index is "sparse" because it does not include all documents of a collection. To create a sparse index, use the db.collection.ensureIndex() method with the sparse option set to true.
db.addresses.ensureIndex( { "user_id": 1 }, { sparse: true } )
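The sparse rule separates two cases that are easy to confuse: a field that is missing and a field that is present with a null value. In the plain JavaScript sketch below only the first case is skipped. The sample addresses and variable names are hypothetical.

```javascript
// Hypothetical documents: one with a value, one without the field, one with null.
const addresses = [
  { _id: 1, user_id: "u1" },
  { _id: 2 },
  { _id: 3, user_id: null },
];

// A sparse index keeps an entry only when the field exists on the document,
// so the null value in document 3 is still indexed.
const sparseEntries = addresses
  .filter((doc) => "user_id" in doc)
  .map((doc) => ({ key: doc.user_id, _id: doc._id }));

console.log(sparseEntries.map((e) => e._id));
```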

30 Remove Index To remove an index from a collection, use the dropIndex() method and the following procedure.
Remove a Specific Index
db.accounts.dropIndex( { "user_id": 1 } )
Remove All Indexes except for the _id index from a collection
db.collection.dropIndexes()

31 Rebuild Indexes If you need to rebuild indexes for a collection, you can use the db.collection.reIndex() method to rebuild all indexes on a collection in a single operation. This operation drops all indexes, including the _id index, and then rebuilds all indexes.

32 Return a List of All Indexes
List all Indexes on a Collection
To return a list of all indexes on a collection, use the db.collection.getIndexes() method. Example: to view all indexes on the user collection:
db.user.getIndexes()
List all Indexes for a Database
To return a list of all indexes on all collections in a database:
db.system.indexes.find()

33 Thank You

