Monday, August 26, 2019

MongoDB: All about MongoDB Replica Set

Replica set basic architecture: writes can happen only on the PRIMARY; SECONDARIES replicate the primary's data and can optionally serve reads (by default, reads also go to the primary).

The OPLOG is a capped collection that replica set secondaries tail (like the unix tail command): they read all the writes that happened on the PRIMARY sequentially and apply the same operations to stay in sync with the PRIMARY.
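You can peek at the oplog yourself from the shell. It lives in the local database as the capped collection oplog.rs (this is just a read-only look; sorting on $natural descending returns the most recently applied entry first):

```javascript
use local
// newest entry first: the last operation this member applied
db.oplog.rs.find().sort({$natural: -1}).limit(1)
```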

Let's start with a basic replica set.

mkdir /home/mongod/rs1
[mongod@sikki4u1c rs1]$  mongod --dbpath /home/mongod/rs1 --replSet r1 --logpath /var/log/mongodb/mongod_rs1.log --fork
about to fork child process, waiting until server is ready for connections.
forked process: 3286
child process started successfully, parent exiting
[mongod@sikki4u1c rs1]$

If you look at the log, you will see:

"2019-08-25T13:31:12.626+0000 I REPL     [initandlisten] Did not find local replica set configuration document at startup;  NoMatchingDocument: Did not find replica set configuration document in local.system.replset"

Now, we need to initiate the replica set using the rs.initiate() function.

mongo

> rs.initiate()
{
        "info2" : "no configuration specified. Using a default configuration for the set",
        "me" : "localhost:27017",
        "ok" : 1,
        "operationTime" : Timestamp(1566740045, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1566740045, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

r1:SECONDARY>  -----press enter here
r1:PRIMARY>    -----there you go, this instance is PRIMARY now.

To see the configuration mongo created for us, you can run rs.config() or its alias rs.conf().

r1:PRIMARY> rs.config()
{
        "_id" : "r1",
        "version" : 1,
        "protocolVersion" : NumberLong(1),
        "members" : [
                {
                        "_id" : 0,
                        "host" : "localhost:27017",
                        "arbiterOnly" : false,
                        "buildIndexes" : true,
                        "hidden" : false,
                        "priority" : 1,
                        "tags" : {

                        },
                        "slaveDelay" : NumberLong(0),
                        "votes" : 1
                }
        ],
        "settings" : {
                "chainingAllowed" : true,
                "heartbeatIntervalMillis" : 2000,
                "heartbeatTimeoutSecs" : 10,
                "electionTimeoutMillis" : 10000,
                "catchUpTimeoutMillis" : -1,
                "catchUpTakeoverDelayMillis" : 30000,
                "getLastErrorModes" : {

                },
                "getLastErrorDefaults" : {
                        "w" : 1,
                        "wtimeout" : 0
                },
                "replicaSetId" : ObjectId("5d628e4dfb9311e2065bfcee")
        }
}
r1:PRIMARY>

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

To check the status of the replica set:

rs.status()

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Sunday, August 25, 2019

MongoDB Index: Rebuild and Compact

REBUILD: Mongo can build an index in two ways: foreground and background. If you don't specify anything, it uses the foreground strategy by default. A foreground build imposes a lock for the duration of the build.
During a foreground index build, no other query or write can happen on that collection, and even some server-wide operations may be affected if they need that database lock.

You can request a background build by passing the background option:

db.myCollection.ensureIndex({....}, {background: true})

but a background index build takes longer.
=========================

You can rebuild indexes using the reIndex() function on a collection from the shell. This sends the reIndex command to the server, which drops and recreates all the indexes on that collection.
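For example, on our messages collection the rebuild would look like this (a sketch; it blocks operations on the collection while it runs):

```javascript
// drops and recreates every index on the collection
db.messages.reIndex()
```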

If you run a db repair on the database, the indexes on its collections will be recreated as well.
But db repair is not necessary for healthy databases and should not be run unless absolutely necessary.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
COMPACT: During scheduled maintenance windows, you may run compact on your collections. compact defragments a collection and rebuilds its indexes. It is a blocking operation and takes time to complete, but running it during maintenance windows ensures that your collections, storage and indexes are structurally optimized. The compact command is performed on a single collection; to compact our messages collection:

>db.runCommand({compact:'messages'})
{ "ok" : 1 }
>

This is a sample collection with little data; on a real collection compact can take much longer. Compact is blocking, so make sure you run it only during maintenance windows.
Please note: on a replica set, you can take secondaries offline one at a time and compact them, and stepDown() the primary before compacting it.
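A sketch of that rolling procedure (run each step from a mongo shell connected to the member in question; the collection name is taken from the example above):

```javascript
// 1. on each SECONDARY in turn -- compact blocks only that member, not the set:
db.runCommand({compact: 'messages'})

// 2. connected to the PRIMARY, step it down so an election
//    promotes another member and it becomes a secondary:
rs.stepDown()

// 3. compact the former primary once it is a secondary:
db.runCommand({compact: 'messages'})
```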


Friday, August 23, 2019

MongoDB: Types of indexes and their applications

By default MongoDB creates an index stored in a B-tree structure.

There are other options available to create indexes that are more application specific:

1.) Covering Index: There is a subset of queries that do not require access to the documents at all.
This happens when all the information in the query result is contained in the index itself; when this magic alignment happens, the query can be magically fast.
A query that only filters on fields and returns fields contained in the index is covered by the index.

Example:
Consider the messages collection below. It has an index on the keys "from.country", "from.number" and "time", which means the index already holds the document values for these 3 fields.
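That compound index could have been created like this (a sketch; the field names are the ones used in the example queries below):

```javascript
db.messages.ensureIndex({'from.country': 1, 'from.number': 1, time: 1})
```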


If there is a query which only requires "time", "from.country", "from.number", or all 3, the number of documents scanned will be zero. See below.
In the example below, let's exclude _id:

query example: db.messages.find({'from.country':'44'},{_id:0,time:1})

Explain example: db.messages.find({'from.country':'44'},{_id:0,time:1}).explain('executionStats')

The stats above show that mongo did not scan any documents to get the results.

If your use case has any query that can be covered, this can provide huge performance benefit.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

2.) Sparse index: Indexes are best when they are compact; the smaller they are, the faster they can be loaded and processed. Regular mongo indexes hold an entry for every document: if a document happens to have a null value for the indexed field, or lacks the field at all, the index will still include that document under the null key. This can be a waste if only a small portion of your documents have that field at all. Remember, mongo has no predefined schema, and it allows documents in the same collection to contain arbitrarily different fields.
Consider my case: I have a messages database, and only a few documents are associated with some promotion and have a promo field.


Refer to the case above, where only 188 of my 10000 documents have the promo field. That is a good candidate for a sparse index. A sparse index will only hold an entry for a document if the document actually contains the field.

Let's create a sparse index:
db.messages.ensureIndex({promo:1},{sparse:true})


A sparse index is an optimization over a non-sparse index because the index size can be dramatically smaller when only a small subset of documents contain the field.

Let's see the collection stats.
As you can see above, the promo index is much smaller than the other indexes. A sparse index can be very helpful when index sizes run into MBs or GBs.

Note: sparse indexes have some limitations, so decide carefully. For example, a sort operation can use an index, but if the index is sparse it doesn't contain every document, so your result list would be shorter than it should be. A sort should only order documents, not drop the ones that have no value.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

3.) Unique Index: Ensures no other document contains the same field value. It enforces uniqueness within a single collection, not across shards. Example: my telephone company's customers each have a unique phone number.
Each customer has a phone number; let's ensure it is unique and remains unique by adding a unique index.

Prob 1: If you try to add a customer who has the same mobile number as an existing customer, you will get an error:
"errmsg" : "E11000 duplicate key error index"

Prob 2: Even if you try to create a customer without any phone number, the first insert will succeed, but another customer without a phone number will give the same error:
"errmsg" : "E11000 duplicate key error index"

Solution !!! How about we create an index that is both unique and sparse? WHY?? Because the sparse index will not have entries for documents with no phone number. Let's try that.

Let's drop and recreate the index:

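A sketch of that step (assuming a hypothetical customers collection with a number field; the index name number_1 follows mongo's default naming):

```javascript
db.customers.dropIndex('number_1')   // drop the plain unique index
// unique: duplicate numbers rejected; sparse: docs with no number skipped
db.customers.ensureIndex({number: 1}, {unique: true, sparse: true})
```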

We already have a customer with no phone; let's try to add another one without a phone number.

WOW, it worked. We have 2 documents with no number. That's what we wanted: if you have a number, it had better be unique; if you don't have a number, that's fine, and we can have multiple people with no number.

What if a customer has more than one phone number? The same index will work fine. Instead of a single-value number field, it can be an array, and mongo will still ensure that no number is duplicated.

I will add a new customer with his numbers in an array, then try to push an existing number. Let's see.
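A sketch of that experiment (hypothetical customers collection and phone numbers; the unique sparse index on number is assumed to be in place):

```javascript
// numbers stored in an array instead of a single value
db.customers.insert({name: 'Sam', number: ['111-2222', '333-4444']})

// pushing a number that already belongs to another customer
// fails with the same E11000 duplicate key error
db.customers.update({name: 'Sam'}, {$push: {number: '555-6666'}})
```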


And when I try to push a unique number, it is added successfully. See the result below.


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
TTL Index: it means time-to-live index. Data is great and we would like to keep it forever. Well, most data, and not forever. Just as long as necessary. So how do we get rid of old data? Traditionally we'd put a timestamp on rows in the DB, then run a batch job on a timer outside the database to delete rows older than some threshold.
Mongo has a convenient way to do this without running an external job. It is called a TTL index.

TTL is a property on an index that defines how long a document is allowed to live; after that, the document is subject to automatic removal by mongo.

Example: let's create an index so that messages older than 90 days are not kept.
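A sketch, assuming the messages collection's time field holds a date (90 days = 7776000 seconds):

```javascript
// documents whose time value is more than 90 days in the past
// become eligible for automatic removal
db.messages.ensureIndex({time: 1}, {expireAfterSeconds: 7776000})
```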


expireAfterSeconds is expressed in relative terms: it doesn't set a date on which to expire the document, it specifies how old the document may be.

A TTL index must be defined on a single field within a document.
You can't have two TTL indexes on the same collection.
That single field must contain a date datatype and must not be the _id field.
A TTL index can also be used by mongo to optimize query plans.


Thursday, August 22, 2019

How to tell if an index is useful in MongoDB

An index is useful if the query you are running uses it.


> db.score.find({_id:10}).explain()
{
        "queryPlanner" : {
                "plannerVersion" : 1,
                "namespace" : "movie.score",
                "indexFilterSet" : false,
                "parsedQuery" : {
                        "_id" : {
                                "$eq" : 10
                        }
                },
                "winningPlan" : {
                        "stage" : "IDHACK"
                },
                "rejectedPlans" : [ ]
        },
        "serverInfo" : {
                "host" : "sikki4u1c.mylabserver.com",
                "port" : 27017,
                "version" : "3.6.13",
                "gitVersion" : "db3c76679b7a3d9b443a0e1b3e45ed02b88c539f"
        },
        "ok" : 1
}
>

Wednesday, August 21, 2019

Indexing in MongoDB

Important to know about indexes:

1.)  _id is the default index in all collections.

Let's see all the indexes on my score collection:
> db.score.getIndexes()
[
        {
                "v" : 2,                     
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "movie.score"
        }
]
>

Here "v" : 2 is the index version, internal mongo housekeeping.

"key" : { "_id" : 1 } is the actual index description: the index is based on the field _id and holds values in ascending order.

"name" : "_id_" is the name of the index.

"ns" : "movie.score" is the namespace, the fully qualified name of the collection.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

2.) ensureIndex: how to ensure that you have an index on a particular field.

Let's look at one collection to see what data it holds and where we can create an index:

> db.score.findOne()
{
        "_id" : 0,
        "name" : "aimee Zank",
        "scores" : [
                {
                        "score" : 1.463179736705023,
                        "type" : "exam"
                },
                {
                        "score" : 11.78273309957772,
                        "type" : "quiz"
                },
                {
                        "score" : 35.8740349954354,
                        "type" : "homework"
                }
        ]
}

Now let's create an index on scores:

> db.score.ensureIndex({scores:1})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}
>
> db.score.getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "movie.score"
        },
        {
                "v" : 2,
                "key" : {
                        "scores" : 1
                },
                "name" : "scores_1",
                "ns" : "movie.score"
        }
]
To drop the index

> db.score.dropIndex('scores_1')
{ "nIndexesWas" : 2, "ok" : 1 }
> db.score.getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "movie.score"
        }
]

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Index with Nested document:

> db.score.findOne()
{
        "_id" : 0,
        "name" : "aimee Zank",
        "scores" : [
                {
                        "score" : 1.463179736705023,
                        "type" : "exam"
                },
                {
                        "score" : 11.78273309957772,
                        "type" : "quiz"
                },
                {
                        "score" : 35.8740349954354,
                        "type" : "homework"
                }
        ]
}
>
Let's create an index on the type field inside the scores documents:

> db.score.ensureIndex({"scores.type":1})
{
        "createdCollectionAutomatically" : false,
        "numIndexesBefore" : 1,
        "numIndexesAfter" : 2,
        "ok" : 1
}
> db.score.getIndexes()
[
        {
                "v" : 2,
                "key" : {
                        "_id" : 1
                },
                "name" : "_id_",
                "ns" : "movie.score"
        },
        {
                "v" : 2,
                "key" : {
                        "scores.type" : 1
                },
                "name" : "scores.type_1",
                "ns" : "movie.score"
        }
]


++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Tuesday, August 20, 2019

Mongoimport / Mongoexport utilities

Currently mongoimport and mongoexport support 3 file formats:

JSON
CSV
TSV


Syntax:

[mongod@sikki4u1c ~]$  mongoimport -u mongo-root -p passw0rd --db movie --collection score mongoimport.json --authenticationDatabase admin
2019-08-20T11:55:18.907+0000    connected to: localhost
2019-08-20T11:55:18.918+0000    imported 20 documents
[mongod@sikki4u1c ~]$  mongo -u mongo-root -p passw0rd
MongoDB shell version v3.6.13
connecting to: mongodb://127.0.0.1:27017/?gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("68a9cf0d-bfbc-4626-a322-0b24cb9a4811") }
MongoDB server version: 3.6.13
Server has startup warnings:
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten]
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten]
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten]
> use movie
switched to db movie
> db.score.find()
{ "_id" : 0, "name" : "aimee Zank", "scores" : [ { "score" : 1.463179736705023, "type" : "exam" }, { "score" : 11.78273309957772, "type" : "quiz" }, { "score" : 35.8740349954354, "type" : "homework" } ] }
{ "_id" : 1, "name" : "Aurelia Menendez", "scores" : [ { "score" : 60.06045071030959, "type" : "exam" }, { "score" : 52.79790691903873, "type" : "quiz" }, { "score" : 71.76133439165544, "type" : "homework" } ] }
{ "_id" : 3, "name" : "Bao Ziglar", "scores" : [ { "score" : 71.64343899778332, "type" : "exam" }, { "score" : 24.80221293650313, "type" : "quiz" }, { "score" : 42.26147058804812, "type" : "homework" } ] }
{ "_id" : 2, "name" : "Corliss Zuk", "scores" : [ { "score" : 67.03077096065002, "type" : "exam" }, { "score" : 6.301851677835235, "type" : "quiz" }, { "score" : 66.28344683278382, "type" : "homework" } ] }
{ "_id" : 4, "name" : "Zachary Langlais", "scores" : [ { "score" : 78.68385091304332, "type" : "exam" }, { "score" : 90.2963101368042, "type" : "quiz" }, { "score" : 34.41620148042529, "type" : "homework" } ] }
{ "_id" : 6, "name" : "Jenette Flanders", "scores" : [ { "score" : 37.32285459166097, "type" : "exam" }, { "score" : 28.32634976913737, "type" : "quiz" }, { "score" : 81.57115318686338, "type" : "homework" } ] }

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

If there are updates to documents and you want to apply those values to the collection,
we need to use the --upsert option.

[mongod@sikki4u1c ~]$  mongoimport -u mongo-root -p passw0rd --db movie --collection score --upsert mongo_update.json --authenticationDatabase admin
2019-08-20T11:59:10.489+0000    connected to: localhost
2019-08-20T11:59:10.491+0000    imported 2 documents
[mongod@sikki4u1c ~]$

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

To import csv file

mongoimport -u mongo-root -p passw0rd --type csv --db movie --collection score mongo_csv.csv

This may fail because the header line contains field names rather than data. To tell mongoimport to use the header line for field names, pass --headerline:

mongoimport -u mongo-root -p passw0rd --type csv --headerline --db movie --collection score mongo_csv.csv

If your source file doesn't have a header line, then you need to specify the column names explicitly with --fields:

mongoimport -u mongo-root -p passw0rd --type csv --fields _id,name,scores --db movie --collection score mongo_csv.csv

You can also write a field file if you have to import CSVs frequently.

The field file should list all the columns:

more fieldfiledemo.txt
_id
name
scores


mongoimport -u mongo-root -p passw0rd --type csv --fieldFile fieldfiledemo.txt --db movie --collection score mongo_csv.csv


++++++++++++++++++++++++++++++++++++++++++
MONGOEXPORT UTILITY
++++++++++++++++++++++++++++++++++++++++++


[mongod@sikki4u1c ~]$  mongoexport -u mongo-root -p passw0rd --db movie --collection score --authenticationDatabase admin >> out.json
2019-08-20T12:14:29.929+0000    connected to: localhost
2019-08-20T12:14:29.930+0000    exported 22 records
[mongod@sikki4u1c ~]$ ls -ltr out.json
-rw-rw-r--. 1 mongod mongod 3843 Aug 20 12:14 out.json
[mongod@sikki4u1c ~]$

The other way is to provide the --out argument:

[mongod@sikki4u1c ~]$  mongoexport -u mongo-root -p passw0rd --db movie --collection score --authenticationDatabase admin --out out_2.json
2019-08-20T12:15:44.853+0000    connected to: localhost
2019-08-20T12:15:44.854+0000    exported 22 records
[mongod@sikki4u1c ~]$ ls -ltr out_2.json
-rw-rw-r--. 1 mongod mongod 3843 Aug 20 12:15 out_2.json
[mongod@sikki4u1c ~]$

If you want to export only the name field:

[mongod@sikki4u1c ~]$  mongoexport -u mongo-root -p passw0rd --db movie --collection score --fields name --authenticationDatabase admin --out out_2.json
2019-08-20T12:17:27.360+0000    connected to: localhost
2019-08-20T12:17:27.361+0000    exported 22 records
[mongod@sikki4u1c ~]$ more out_2.json
{"_id":0,"name":"aimee Zank"}
{"_id":1,"name":"Aurelia Menendez"}
{"_id":3,"name":"Bao Ziglar"}
{.
.
.
..
{"_id":18,"name":"Verdell Sowinski"}
{"_id":10,"name":"Denisha Cast"}
{"_id":19,"name":"Gisela Levin"}
{"_id":198,"name":"Timothy Harrod"}
{"_id":199,"name":"Rae Kohout"}
[mongod@sikki4u1c ~]$

Exporting to JSON will always capture the _id field as well, but a CSV export will not.


[mongod@sikki4u1c ~]$  mongoexport -u mongo-root -p passw0rd --db movie --collection score --fields name --type csv --authenticationDatabase admin --out out_2.csv
2019-08-20T12:19:09.816+0000    connected to: localhost
2019-08-20T12:19:09.817+0000    exported 22 records
[mongod@sikki4u1c ~]$ more out_2.csv
name
aimee Zank
Aurelia Menendez
Bao Ziglar
Corliss Zuk
Zachary Langlais
Jenette Flanders
Wilburn Spiess
Salena Olmos
Daphne Zheng
Sanda Ryba
Marcus Blohm
Quincy Danaher
Jessika Dagenais
Alix Sherrill
Tambra Mercure
Dodie Staller
Fletcher Mcconnell
Verdell Sowinski
Denisha Cast
Gisela Levin
Timothy Harrod
Rae Kohout
[mongod@sikki4u1c ~]$

NOTE: You must supply field names in order to export to CSV; if you don't, mongoexport will fail.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Now I have a peeps collection which is a little more complex. JSON can represent arbitrary object structures and nested objects, but what if we need this data in CSV?

> db.peeps.find().pretty()
{
        "_id" : ObjectId("5d52b9237aa191d958d087e3"),
        "name" : "dude",
        "born" : ISODate("1984-04-01T00:00:00Z"),
        "likes" : [
                "naps",
                "cake"
        ],
        "points" : 1
}
>

Need to update this part

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Selective export: this extracts selected data, say for a query. The result set of the query can be exported:

[mongod@sikki4u1c ~]$  mongoexport -u mongo-root -p passw0rd --db movie --collection score --query '{_id:{$gt:10}}' --authenticationDatabase admin
2019-08-20T13:09:18.333+0000    connected to: localhost

{"_id":11,"name":"Marcus Blohm","scores":[{"score":78.42617835651868,"type":"exam"},{"score":82.58372817930675,"type":"quiz"},{"score":87.49924733328717,"type":"homework"}]}
{"_id":12,"name":"Quincy Danaher","scores":[{"score":54.29841278520669,"type":"exam"},{"score":85.61270164694737,"type":"quiz"},{"score":80.40732356118075,"type":"homework"}]}
{"_id":13,"name":"Jessika Dagenais","scores":[{"score":90.47179954427436,"type":"exam"},{"score":90.3001402468489,"type":"quiz"},{"score":95.17753772405909,"type":"homework"}]}
{"_id":14,"name":"Alix Sherrill","scores":[{"score":25.15924151998215,"type":"exam"},{"score":68.64484047692098,"type":"quiz"},{"score":24.68462152686763,"type":"homework"}]}
{"_id":15,"name":"Tambra Mercure","scores":[{"score":69.1565022533158,"type":"exam"},{"score":3.311794422000724,"type":"quiz"},{"score":45.03178973642521,"type":"homework"}]}
{"_id":16,"name":"Dodie Staller","scores":[{"score":7.772386442858281,"type":"exam"},{"score":31.84300235104542,"type":"quiz"},{"score":80.52136407989194,"type":"homework"}]}
{"_id":17,"name":"Fletcher Mcconnell","scores":[{"score":39.41011069729274,"type":"exam"},{"score":81.13270307809924,"type":"quiz"},{"score":97.70116640402922,"type":"homework"}]}
{"_id":18,"name":"Verdell Sowinski","scores":[{"score":62.12870233109035,"type":"exam"},{"score":84.74586220889356,"type":"quiz"},{"score":81.58947824932574,"type":"homework"}]}
{"_id":19,"name":"Gisela Levin","scores":[{"score":44.51211101958831,"type":"exam"},{"score":0.6578497966368002,"type":"quiz"},{"score":93.36341655949683,"type":"homework"}]}
{"_id":198,"name":"Timothy Harrod","scores":[{"score":11.9075674046519,"type":"exam"},{"score":20.51879961777022,"type":"quiz"},{"score":64.85650354990375,"type":"homework"}]}
{"_id":199,"name":"Rae Kohout","scores":[{"score":82.11742562118049,"type":"exam"},{"score":49.61295450928224,"type":"quiz"},{"score":28.86823689842918,"type":"homework"}]}
2019-08-20T13:09:18.335+0000    exported 11 records

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

If there is a requirement that each file should have a limited number of lines, you can break the result set into multiple files with --skip and --limit. For this you should also sort on the _id field so the documents come out in a consistent order.

[mongod@sikki4u1c ~]$  mongoexport -u mongo-root -p passw0rd --skip 0 --limit 2 --db movie --collection score  --authenticationDatabase admin
2019-08-20T13:12:46.849+0000    connected to: localhost
{"_id":0,"name":"aimee Zank","scores":[{"score":1.463179736705023,"type":"exam"},{"score":11.78273309957772,"type":"quiz"},{"score":35.8740349954354,"type":"homework"}]}
{"_id":1,"name":"Aurelia Menendez","scores":[{"score":60.06045071030959,"type":"exam"},{"score":52.79790691903873,"type":"quiz"},{"score":71.76133439165544,"type":"homework"}]}
2019-08-20T13:12:46.850+0000    exported 2 records
[mongod@sikki4u1c ~]$  mongoexport -u mongo-root -p passw0rd --skip 2 --limit 2 --db movie --collection score  --authenticationDatabase admin
2019-08-20T13:12:58.994+0000    connected to: localhost
{"_id":3,"name":"Bao Ziglar","scores":[{"score":71.64343899778332,"type":"exam"},{"score":24.80221293650313,"type":"quiz"},{"score":42.26147058804812,"type":"homework"}]}
{"_id":2,"name":"Corliss Zuk","scores":[{"score":67.03077096065002,"type":"exam"},{"score":6.301851677835235,"type":"quiz"},{"score":66.28344683278382,"type":"homework"}]}
2019-08-20T13:12:58.994+0000    exported 2 records
[mongod@sikki4u1c ~]$

And so on.....








Basic MongoDB Commands

Query to insert a document:

> db.goo.insert({_id:4})
WriteResult({ "nInserted" : 1 })
>

++++++++++++++++++++++++++++++++++++++++++++
Query to select what we inserted

> db.goo.find()
{ "_id" : 4 }
>
+++++++++++++++++++++++++++++++++++++++++++++++++
If we try to insert the same data again, mongo complains:


> db.goo.insert({_id:4})
WriteResult({
        "nInserted" : 0,
        "writeError" : {
                "code" : 11000,
                "errmsg" : "E11000 duplicate key error collection: demo.goo index: _id_ dup key: { : 4.0 }"
        }
})
>

++++++++++++++++++++++++++++++++++++++++++++++++++++
You can't insert a document with the same _id, but you can save it using save():

> db.goo.save({_id:4,x:1})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
>
> db.goo.find()
{ "_id" : 4, "x" : 1 }

> db.goo.save({_id:4,x:1,y:2})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.goo.find()
{ "_id" : 4, "x" : 1, "y" : 2 }
>

Let's say I want to change the value of y.

> db.goo.save({_id:4,y:false})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.goo.find()
{ "_id" : 4, "y" : false }
>

Hey, save has replaced the document, not updated the y field.

save() is used to replace the document.

++++++++++++++++++++++++++++++++++++++

Search doc:

-> Find a document with ID 4
> db.goo.find({_id:4})
{ "_id" : 4, "y" : false }
>

-> Find a doc with y= false

> db.goo.find({y:false})
{ "_id" : 4, "y" : false }

>> Find all doc with id=4 and y=false
> db.goo.find({_id:4,y:false})
{ "_id" : 4, "y" : false }
>

>> Find documents whose _id is greater than 1 and less than 7
> db.goo.find({_id:{$gt:1,$lt:7}})
{ "_id" : 4, "y" : false }
>
++++++++++++++++++++++++++++++++

Update - unlike save(), it does not replace the whole document
++++++++++++++++++++++++++++++++++++
>> Find the document whose _id is 4 and set x=1

> db.goo.update({_id:4},{$set:{x:1}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.goo.find()
{ "_id" : 4, "y" : false, "x" : 1 }
>
> db.goo.update({_id:4},{$set:{y:true}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.goo.find()
{ "_id" : 4, "y" : true, "x" : 1 }
>

Not just _id: you can match on other criteria and update. Here we will increment x by 1.

> db.goo.update({y:true},{$inc:{x:1}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.goo.find()
{ "_id" : 4, "y" : true, "x" : 2 }
>


++++++++++++++++++++++++++++++

To describe a collection:

db.goo.stats()

+++++++++++++++++++++++++++++

Delete
+++++++++++++++++++++
Delete a document where _id=4

> db.goo.remove({_id:4})
WriteResult({ "nRemoved" : 1 })
> db.goo.find()
>

To delete the collections

> db.goo.drop()
true
> show collections
>

Drop a database

> db
demo
> db.dropDatabase()
{ "dropped" : "demo", "ok" : 1 }
> db
demo
> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB
movie   0.000GB
>



+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
BSON - Binary JSON (a binary-encoded serialization of JSON-like documents)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


> use demo
switched to db demo
> db
demo
>

> var p={name:'dude',born:ISODate("1984-04-01"),likes:['naps','cake'],points:1}
> db.peeps.save(p)
WriteResult({ "nInserted" : 1 })
> db.peeps.find()
{ "_id" : ObjectId("5d52b9237aa191d958d087e3"), "name" : "dude", "born" : ISODate("1984-04-01T00:00:00Z"), "likes" : [ "naps", "cake" ], "points" : 1 }
>
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

Sort:

++++++++++++++
Here is a collection.

Sort y in descending order.

Sort first by y then by x (here y ascending, then x ascending).

Flip that: y ascending and x descending.

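Those three sorts can be sketched like this (the collection name coll and the fields x and y are assumed from the description above; the screenshots of the output are not reproduced here):

```javascript
db.coll.find().sort({y: -1})          // y descending
db.coll.find().sort({y: 1, x: 1})     // y ascending, then x ascending
db.coll.find().sort({y: 1, x: -1})    // y ascending, then x descending
```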

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Limit: from millions of documents, I may want to display only a few of them.

When I want only the last 2 documents, I don't want to see millions of records.
Note: sort arranges the data ascending or descending, and then limit fetches from that order.

> db.mycol.find().sort({_id:1}).limit(2)
{ "_id" : ObjectId("5d5a796833fcf969c786e4fc"), "title" : "MongoDB Overview", "description" : "MongoDB is no sql database", "by" : "tutorials point", "url" : "http://www.tutorialspoint.com", "tags" : [ "mongodb", "database", "NoSQL" ], "likes" : 100 }
{ "_id" : ObjectId("5d5a79c633fcf969c786e4fd"), "title" : "MongoDB Mideocre", "description" : "MongoDB is robust database", "by" : "tutorials point", "url" : "http://www.testurl1.com", "tags" : [ "mongodb", "database", "NoSQL" ], "likes" : 100 }
>
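To get the last 2 documents instead, flip the sort (a sketch following the example above):

```javascript
db.mycol.find().sort({_id: -1}).limit(2)   // newest _id first, then take 2
```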

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




MongoDB Restore:

The mongorestore utility is used to restore data backed up using the mongodump utility.
In order to run mongorestore, the user should have the restore role on the database.

If you have a mongo instance running with no data and you want to restore all of the data:

mongorestore /home/mongod/backup/dump

This will restore everything, assuming you are connecting to the local server on the default port.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

To restore to a named host:

mongorestore --host hostname --port 27017 /home/mongod/backup/dump


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
If you want to replace documents with the documents in the backup, use the --drop parameter.
This drops each target collection before restoring it.

[mongod@sikki4u1c dump]$ mongorestore -u mongo-root -p passw0rd --drop /home/mongod/backup/dump
2019-08-19T11:22:36.620+0000    preparing collections to restore from
2019-08-19T11:22:36.626+0000    reading metadata for movie.mycol from /home/mongod/backup/dump/movie/mycol.metadata.json
2019-08-19T11:22:36.643+0000    restoring movie.mycol from /home/mongod/backup/dump/movie/mycol.bson
2019-08-19T11:22:36.656+0000    reading metadata for demo.peeps from /home/mongod/backup/dump/demo/peeps.metadata.json
2019-08-19T11:22:36.658+0000    no indexes to restore
2019-08-19T11:22:36.658+0000    finished restoring movie.mycol (3 documents)
2019-08-19T11:22:36.671+0000    restoring demo.peeps from /home/mongod/backup/dump/demo/peeps.bson
2019-08-19T11:22:36.676+0000    reading metadata for movie.movie from /home/mongod/backup/dump/movie/movie.metadata.json
2019-08-19T11:22:36.677+0000    no indexes to restore
2019-08-19T11:22:36.677+0000    finished restoring demo.peeps (1 document)
2019-08-19T11:22:36.691+0000    restoring movie.movie from /home/mongod/backup/dump/movie/movie.bson
2019-08-19T11:22:36.694+0000    no indexes to restore
2019-08-19T11:22:36.694+0000    finished restoring movie.movie (1 document)
2019-08-19T11:22:36.694+0000    restoring users from /home/mongod/backup/dump/admin/system.users.bson
2019-08-19T11:22:36.717+0000    done
[mongod@sikki4u1c dump]$

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Restore only one collection:

Note: --drop drops the collection if it exists and then restores it.


[mongod@sikki4u1c movie]$ mongorestore -u mongo-root -p passw0rd --drop --collection mycol --db movie /home/mongod/backup/dump/movie/mycol.bson --authenticationDatabase admin
2019-08-20T11:41:57.583+0000    checking for collection data in /home/mongod/backup/dump/movie/mycol.bson
2019-08-20T11:41:57.586+0000    reading metadata for movie.mycol from /home/mongod/backup/dump/movie/mycol.metadata.json
2019-08-20T11:41:57.595+0000    restoring movie.mycol from /home/mongod/backup/dump/movie/mycol.bson
2019-08-20T11:41:57.657+0000    no indexes to restore
2019-08-20T11:41:57.657+0000    finished restoring movie.mycol (3 documents)
2019-08-20T11:41:57.657+0000    done
[mongod@sikki4u1c movie]$


+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Restore all collections into a new database.

 mongorestore -u mongo-root -p passw0rd  --db restored_movie /home/mongod/backup/dump/movie/ --authenticationDatabase admin

 2019-08-20T11:44:16.072+0000    the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2019-08-20T11:44:16.073+0000    building a list of collections to restore from /home/mongod/backup/dump/movie dir
2019-08-20T11:44:16.074+0000    reading metadata for restored_movie.mycol from /home/mongod/backup/dump/movie/mycol.metadata.json
2019-08-20T11:44:16.084+0000    restoring restored_movie.mycol from /home/mongod/backup/dump/movie/mycol.bson
2019-08-20T11:44:16.086+0000    no indexes to restore
2019-08-20T11:44:16.086+0000    finished restoring restored_movie.mycol (3 documents)
2019-08-20T11:44:16.089+0000    reading metadata for restored_movie.movie from /home/mongod/backup/dump/movie/movie.metadata.json
2019-08-20T11:44:16.102+0000    restoring restored_movie.movie from /home/mongod/backup/dump/movie/movie.bson
2019-08-20T11:44:16.105+0000    no indexes to restore
2019-08-20T11:44:16.105+0000    finished restoring restored_movie.movie (1 document)
2019-08-20T11:44:16.105+0000    done
[mongod@sikki4u1c movie]$  mongo -u mongo-root -p passw0rd
MongoDB shell version v3.6.13
connecting to: mongodb://127.0.0.1:27017/?gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("ec1042f6-cfd4-4ba5-97fe-ef16218a7c8d") }
MongoDB server version: 3.6.13
Server has startup warnings:
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten]
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten]
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-08-20T05:49:51.204+0000 I CONTROL  [initandlisten]
> show dbs
admin           0.000GB
config          0.000GB
demo            0.000GB
local           0.000GB
movie           0.000GB
restored_movie  0.000GB
> use restored_movie
switched to db restored_movie
> show collections
movie
mycol
>
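The deprecation warning in the restore log points at --nsInclude; the same restore-into-a-new-database can be sketched with namespace mapping instead of --db (assumes mongorestore 3.4+; the run helper is part of the sketch and just echoes the command when mongorestore is not installed):

```shell
# Sketch: restore every movie.* collection as restored_movie.* using
# namespace options instead of the deprecated --db flag.
# run() executes when mongorestore is on PATH, otherwise echoes (dry run).
run() { command -v "$1" >/dev/null 2>&1 && "$@" || echo "$@"; }

run mongorestore -u mongo-root -p passw0rd --authenticationDatabase admin \
    --nsInclude 'movie.*' --nsFrom 'movie.*' --nsTo 'restored_movie.*' \
    /home/mongod/backup/dump
```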
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
For a replica set, use the --oplogReplay option when restoring, to replay the oplog captured by mongodump --oplog:

mongorestore -u mongo-root -p passw0rd  --oplogReplay /home/mongod/backup/dump --authenticationDatabase admin
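The full point-in-time round trip pairs mongodump --oplog with mongorestore --oplogReplay. A sketch under the same credentials and paths used in these examples; the run helper is an assumption of the sketch and echoes each command when the tool is missing, so this can be dry-run:

```shell
# Sketch: point-in-time backup and restore of a replica set member.
# run() executes when the tool exists, otherwise echoes (dry run).
run() { command -v "$1" >/dev/null 2>&1 && "$@" || echo "$@"; }

# 1. Dump all databases plus the operations applied while dumping
#    (written to oplog.bson at the top of the dump directory).
run mongodump -u mongo-root -p passw0rd --authenticationDatabase admin \
    --oplog --out /home/mongod/backup/dump

# 2. Restore the dump, then replay the captured oplog on top of it.
run mongorestore -u mongo-root -p passw0rd --authenticationDatabase admin \
    --oplogReplay /home/mongod/backup/dump
```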


Monday, August 19, 2019

MongoDB Backup

Backups are taken with the mongodump utility.

1.) mongodump: this utility backs up the full MongoDB instance into the current directory
[mongod@sikki4u1c backup]$ mongodump -u mongo-root -p passw0rd
2019-08-19T10:46:03.367+0000    writing admin.system.users to
2019-08-19T10:46:03.367+0000    done dumping admin.system.users (3 documents)
2019-08-19T10:46:03.367+0000    writing admin.system.version to
2019-08-19T10:46:03.368+0000    done dumping admin.system.version (2 documents)
2019-08-19T10:46:03.368+0000    writing movie.mycol to
2019-08-19T10:46:03.368+0000    writing demo.peeps to
2019-08-19T10:46:03.368+0000    writing movie.movie to
2019-08-19T10:46:03.369+0000    done dumping demo.peeps (1 document)
2019-08-19T10:46:03.369+0000    done dumping movie.mycol (3 documents)
2019-08-19T10:46:03.377+0000    done dumping movie.movie (1 document)
[mongod@sikki4u1c backup]$ ls -ltr
total 0
drwxrwxr-x. 5 mongod mongod 41 Aug 19 10:46 dump
[mongod@sikki4u1c backup]$ cd dump/
[mongod@sikki4u1c dump]$ ls -ltr
total 4
drwxrwxr-x. 2 mongod mongod 4096 Aug 19 10:46 admin
drwxrwxr-x. 2 mongod mongod   92 Aug 19 10:46 movie
drwxrwxr-x. 2 mongod mongod   49 Aug 19 10:46 demo
[mongod@sikki4u1c dump]$ pwd
/home/mongod/backup/dump
[mongod@sikki4u1c dump]$

++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2.) mongodump with more options, explicitly specifying the output directory

[mongod@sikki4u1c backup]$ mongodump -u mongo-root -p passw0rd --host 127.0.0.1 --port 27017 --out /home/mongod/backup/dump
2019-08-19T10:50:39.928+0000    writing admin.system.users to
2019-08-19T10:50:39.928+0000    done dumping admin.system.users (3 documents)
2019-08-19T10:50:39.928+0000    writing admin.system.version to
2019-08-19T10:50:39.929+0000    done dumping admin.system.version (2 documents)
2019-08-19T10:50:39.929+0000    writing movie.mycol to
2019-08-19T10:50:39.929+0000    writing demo.peeps to
2019-08-19T10:50:39.929+0000    writing movie.movie to
2019-08-19T10:50:39.929+0000    done dumping movie.mycol (3 documents)
2019-08-19T10:50:39.930+0000    done dumping demo.peeps (1 document)
2019-08-19T10:50:39.938+0000    done dumping movie.movie (1 document)
[mongod@sikki4u1c backup]$

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

3.) mongodump with oplog: tells mongodump to also capture operations that occur during the backup process. This allows a better point-in-time backup.

 mongodump -u mongo-root -p passw0rd --host 127.0.0.1 --port 27017 --out /home/mongod/backup/dump --oplog


NOTE: --oplog works only for replica sets

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4.) mongodump for a specific database:

[mongod@sikki4u1c dump]$ mongodump -u mongo-root -p passw0rd --host 127.0.0.1 --port 27017 --out /home/mongod/backup/dump --authenticationDatabase admin --db movie
2019-08-19T10:59:35.393+0000    writing movie.mycol to
2019-08-19T10:59:35.393+0000    writing movie.movie to
2019-08-19T10:59:35.394+0000    done dumping movie.movie (1 document)
2019-08-19T10:59:35.394+0000    done dumping movie.mycol (3 documents)
[mongod@sikki4u1c dump]$

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
5.) mongodump for a single collection backup

[mongod@sikki4u1c dump]$ mongodump -u mongo-root -p passw0rd --host 127.0.0.1 --port 27017 --out /home/mongod/backup/dump --authenticationDatabase admin --db movie --collection mycol
2019-08-19T11:01:26.023+0000    writing movie.mycol to
2019-08-19T11:01:26.024+0000    done dumping movie.mycol (3 documents)
[mongod@sikki4u1c dump]$

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
6.) mongodump for a multiple collection backup

This is a bit tricky:

[mongod@sikki4u1c dump]$ colls=(movie mycol)
[mongod@sikki4u1c dump]$ for c in ${colls[@]}
> do
>  mongodump -u mongo-root -p passw0rd --host 127.0.0.1 --port 27017 --out /home/mongod/backup/dump --authenticationDatabase admin --db movie --collection $c
> done
2019-08-19T11:06:55.752+0000    writing movie.movie to
2019-08-19T11:06:55.753+0000    done dumping movie.movie (1 document)
2019-08-19T11:06:55.785+0000    writing movie.mycol to
2019-08-19T11:06:55.787+0000    done dumping movie.mycol (3 documents)
[mongod@sikki4u1c dump]$ ls -ltr
total 0
drwxrwxr-x. 2 mongod mongod 92 Aug 19 11:06 movie
[mongod@sikki4u1c dump]$ cd movie/
[mongod@sikki4u1c movie]$ ls
movie.bson  movie.metadata.json  mycol.bson  mycol.metadata.json
[mongod@sikki4u1c movie]$ ls -ltr
total 16
-rw-rw-r--. 1 mongod mongod 125 Aug 19 11:06 movie.metadata.json
-rw-rw-r--. 1 mongod mongod  48 Aug 19 11:06 movie.bson
-rw-rw-r--. 1 mongod mongod 184 Aug 19 11:06 mycol.metadata.json
-rw-rw-r--. 1 mongod mongod 663 Aug 19 11:06 mycol.bson
[mongod@sikki4u1c movie]$
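The loop above can be sketched as a reusable fragment: collection names sit in a bash array, and the run helper (an assumption of the sketch) echoes the command when mongodump is not installed, so the logic can be dry-run:

```shell
# Sketch: one mongodump per collection, driven by a bash array.
# run() executes when mongodump exists, otherwise echoes (dry run).
run() { command -v "$1" >/dev/null 2>&1 && "$@" || echo "$@"; }

colls=(movie mycol)
for c in "${colls[@]}"; do
  run mongodump -u mongo-root -p passw0rd --host 127.0.0.1 --port 27017 \
      --authenticationDatabase admin --db movie --collection "$c" \
      --out /home/mongod/backup/dump
done
```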

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




Saturday, August 10, 2019

Mongo Installation & Replica set step by Step


Mongo Installation

Step 1: Create a mongo repo file in /etc/yum.repos.d

[root@web01 yum.repos.d]# more mongodb.repo
[mongodb-org-3.6]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.6/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-3.6.asc

Step 2: create the mongod user & data directory

useradd mongod
passwd mongod

mkdir -p /data/DATA
chown -R mongod:mongod /data/DATA

[su to mongod]
Step 3: Create a Key File
openssl rand -base64 756 > mongo-keyfile
sudo mkdir /opt/mongo
sudo mv ~/mongo-keyfile /opt/mongo
sudo chmod 400 /opt/mongo/mongo-keyfile
sudo chown mongod:mongod /opt/mongo/mongo-keyfile


Step 4: Allow the port (verify with iptables -nL)
firewall-cmd --add-port=27017/tcp --permanent

firewall-cmd --reload




Step 5: Install MongoDB

yum install -y mongodb-org

Step 6: Modify the /etc/mongod.conf file
dbPath: /data/DATA
bindIp: 10.19.24.16

Step 7: create a sudoers file
vi /etc/sudoers.d/mongo
mongod ALL = (ALL) NOPASSWD: ALL
mongod ALL=(ALL) NOPASSWD: /usr/bin/systemctl start mongod
mongod ALL=(ALL) NOPASSWD: /usr/bin/systemctl stop mongod
mongod ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart mongod
mongod ALL=(ALL) NOPASSWD: /usr/bin/systemctl status mongod


Step 8: Start mongod as the mongod user

sudo systemctl start mongod

Step 9: Create the admin user
 mongo --port 27017 --host 10.19.24.16
use admin
db.createUser({user: "mongo-admin", pwd: "password125", roles:[{role: "root", db: "admin"}]})
Successfully added user: {
        "user" : "mongo-admin",
        "roles" : [
                {
                        "role" : "root",
                        "db" : "admin"
                }
        ]
}


Step 10:

Repeat everything the same on node 2, from creating the repo file through starting mongod

Step 11: On node 1

vi /etc/mongod.conf
security:
 keyFile: /opt/mongo/mongo-keyfile
replication:
 replSetName: rs0


sudo systemctl restart mongod

Step 12: Initiate the replica set

mongo --port 27017 -u mongo-admin -p password125 --authenticationDatabase admin --host 10.19.24.16
rs.initiate()
rs.status()
rs.add("10.19.24.17:27017")

 "errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded10.19.24.17:27017; the following nodes did not respond affirmatively: 10.19.24.16:27017 failed with Authentication failed.",

rs.add( { host: "10.19.24.17:27017", priority: 0, votes: 0 } )

{
        "ok" : 1,
        "operationTime" : Timestamp(1540556102, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1540556102, 1),
                "signature" : {
                        "hash" : BinData(0,"drp3iE9HNq5ciPWaqHWedjHR9Vg="),
                        "keyId" : NumberLong("6616633136530849793")
                }
        }
}
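A member added with priority: 0, votes: 0 stays non-voting. Once it has completed its initial sync, you can give it back a vote and election priority with rs.reconfig() on the PRIMARY. A sketch via mongo --eval, assuming the new node is members[1] in the config; the run helper is part of the sketch and echoes the command when no mongo shell is installed:

```shell
# Sketch: promote the newly added member back to a voting member.
# Assumes it is members[1] in rs.conf(); adjust the index if not.
# run() executes when mongo exists, otherwise echoes (dry run).
run() { command -v "$1" >/dev/null 2>&1 && "$@" || echo "$@"; }

run mongo -u mongo-admin -p password125 --authenticationDatabase admin \
    --host 10.19.24.16 --eval '
      cfg = rs.conf();
      cfg.members[1].priority = 1;
      cfg.members[1].votes = 1;
      rs.reconfig(cfg);'
```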



In rs.status() you may see an error like "not reachable/healthy".


You need to add the below to /etc/mongod.conf on node 2 as well:

security:
 keyFile: /opt/mongo/mongo-keyfile

replication:
 replSetName: rs0



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Testing if replication is working fine

rs0:PRIMARY> use ashishtest
switched to db ashishtest
rs0:PRIMARY> db
ashishtest
rs0:PRIMARY> db.movie.insert({"name":"Ashish Francis"})
WriteResult({ "nInserted" : 1 })
rs0:PRIMARY> db.movie.find()
{ "_id" : ObjectId("5bd3073ee3751e75430decfe"), "name" : "Ashish Francis" }
rs0:PRIMARY> show dbs;
admin       0.000GB
ashishtest  0.000GB
config      0.000GB
local       0.000GB
rs0:PRIMARY>


Now log in to node 2 to see if the data replicated

 mongo --port 27017 -u mongo-admin -p password125 --authenticationDatabase admin --host 10.19.24.17

rs.slaveOk() -- so you can query the slave node

rs0:SECONDARY> use ashishtest
switched to db ashishtest
rs0:SECONDARY> db.movie.find()
{ "_id" : ObjectId("5bd3073ee3751e75430decfe"), "name" : "Ashish Francis" }
rs0:SECONDARY>


We see the same data. Hence, replication is set up correctly.



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

For the application, I created a database and a user to access that db:

rs0:PRIMARY> use FIRSTDB
switched to db FIRSTDB
rs0:PRIMARY> db.createUser(
...    {
...      user: "firstuser",
...      pwd: "firstuser123",
...      roles: [{role: "userAdmin", db: "FIRSTDB"}]
...    }
... );
Successfully added user: {
        "user" : "firstuser",
        "roles" : [
                {
                        "role" : "userAdmin",
                        "db" : "FIRSTDB"
                }
        ]
}
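To verify the new account, connect with it directly against FIRSTDB. Note that userAdmin only allows managing users and roles on that database; grant readWrite as well if the application needs to read and write data. A sketch (the run helper is part of the sketch and echoes the command when no mongo shell is installed):

```shell
# Sketch: authenticate as the application user against FIRSTDB.
# run() executes when mongo exists, otherwise echoes (dry run).
run() { command -v "$1" >/dev/null 2>&1 && "$@" || echo "$@"; }

run mongo --host 10.19.24.16 --port 27017 \
    -u firstuser -p firstuser123 --authenticationDatabase FIRSTDB FIRSTDB
```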


==========================================================