Friday, September 13, 2019

MongoDB : Monitoring

1.) Logfile:  /var/log/mongod/mongo.log, or you can also view the logs remotely with the commands below.

r1:PRIMARY> show logs
global
rs
startupWarnings
r1:PRIMARY> show logs startupWarnings
global
rs
startupWarnings
r1:PRIMARY> show log startupWarnings
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten]
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten]
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] ** WARNING: This server is bound to localhost.
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **          Remote systems will be unable to connect to this server.
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **          Start the server with --bind_ip <address> to specify which IP
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **          addresses it should serve responses from, or with --bind_ip_all to
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **          bind to all interfaces. If this behavior is desired, start the
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **          server with --bind_ip 127.0.0.1 to disable this warning.
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten]
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten]
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten]
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten] **        We suggest setting it to 'never'
2019-09-10T08:53:54.730+0000 I CONTROL  [initandlisten]
r1:PRIMARY>
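
If you want to pull the same log entries from a script or a remote shell, "show log" is just a convenience wrapper around the getLog admin command. A minimal sketch from the mongo shell:

// list the available log names (same output as "show logs")
db.adminCommand({ getLog: "*" })

// fetch the startup warnings; the entries come back as an array of strings
var res = db.adminCommand({ getLog: "startupWarnings" })
res.log.forEach(function (line) { print(line) })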

2.) Verbosity: if you want to see more detailed logs, you can set the level of verbosity.
Verbosity levels are numeric, 0-5.

Level   Description
F       Fatal
E       Error
W       Warning
I       Informational, for verbosity level 0
D[1-5]  Debug, for all verbosity levels > 0
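
For example, in the startup warning lines shown above, "I" is the severity (Informational) and "CONTROL" is the component; once you raise the verbosity for a component you will start seeing "D" (Debug) lines for it, like the QUERY lines later in this post.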


You can also set the verbosity of a specific topic instead of everything:
(accessControl, command, control, geo, index, network, query, replication, storage, journal, write)


Example syntax:  db.setLogLevel(0, 'query')
                 db.setLogLevel(0, 'index')  OR  db.setLogLevel(0, 'replication')

Note: if you don't supply a topic, the level is applied as the default verbosity for all components.
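
For example, a quick sketch of setting the default verbosity and then checking the result:

db.setLogLevel(1)        // no topic given: sets the default verbosity for all components
db.getLogComponents()    // shows the current per-component settings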

Let's set verbosity level 4 for the query component.

r1:PRIMARY> db.setLogLevel(4,'query')
{
        "was" : {
                "verbosity" : 0,   ---- this was before
                "accessControl" : {
                        "verbosity" : -1
                },
                "command" : {
                        "verbosity" : -1
                },
                "control" : {
                        "verbosity" : -1
                },
                "executor" : {
                        "verbosity" : -1
                },
                "geo" : {
                        "verbosity" : -1
                },
                "index" : {
                        "verbosity" : -1
                },
                "network" : {
                        "verbosity" : -1,
                        "asio" : {
                                "verbosity" : -1
                        },
                        "bridge" : {
                                "verbosity" : -1
                        }
                },
                "query" : {
                        "verbosity" : -1
                },
                "replication" : {
                        "verbosity" : -1,
                        "heartbeats" : {
                                "verbosity" : -1
                        },
                        "rollback" : {
                                "verbosity" : -1
                        }
                },
                "sharding" : {
                        "verbosity" : -1,
                        "shardingCatalogRefresh" : {
                                "verbosity" : -1
                        }
                },
                "storage" : {
                        "verbosity" : -1,
                        "journal" : {
                                "verbosity" : -1
                        }
                },
                "write" : {
                        "verbosity" : -1
                },
                "ftdc" : {
                        "verbosity" : -1
                },
                "tracking" : {
                        "verbosity" : -1
                }
        },
        "ok" : 1,
        "operationTime" : Timestamp(1568112427, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1568112427, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
r1:PRIMARY>


It should have changed the verbosity of the query component; let's check.

r1:PRIMARY> db.getLogComponents()
{
        "verbosity" : 0,
        "accessControl" : {
                "verbosity" : -1
        },
        "command" : {
                "verbosity" : -1
        },
        "control" : {
                "verbosity" : -1
        },
        "executor" : {
                "verbosity" : -1
        },
        "geo" : {
                "verbosity" : -1
        },
        "index" : {
                "verbosity" : -1
        },
        "network" : {
                "verbosity" : -1,
                "asio" : {
                        "verbosity" : -1
                },
                "bridge" : {
                        "verbosity" : -1
                }
        },
        "query" : {
                "verbosity" : 4  --- query verbosity changed to 4
        },
        "replication" : {
                "verbosity" : -1,
                "heartbeats" : {
                        "verbosity" : -1
                },
                "rollback" : {
                        "verbosity" : -1
                }
        },
        "sharding" : {
                "verbosity" : -1,
                "shardingCatalogRefresh" : {
                        "verbosity" : -1
                }
        },
        "storage" : {
                "verbosity" : -1,
                "journal" : {
                        "verbosity" : -1
                }
        },
        "write" : {
                "verbosity" : -1
        },
        "ftdc" : {
                "verbosity" : -1
        },
        "tracking" : {
                "verbosity" : -1
        }
}

Let's see if the log entries are created.


r1:PRIMARY> use demo
switched to db demo
r1:PRIMARY> db.testData.count()
100000
r1:PRIMARY> show logs
global
rs
startupWarnings
r1:PRIMARY> show log global

2019-09-10T10:47:08.497+0000 I COMMAND  [conn21] successfully set parameter logComponentVerbosity to { query: { verbosity: 4.0 } } (was { verbosity: 0, accessControl: { verbosity: -1 }, command: { verbosity: -1 }, control: { verbosity: -1 }, executor: { verbosity: -1 }, geo: { verbosity: -1 }, index: { verbosity: -1 }, network: { verbosity: -1, asio: { verbosity: -1 }, bridge: { verbosity: -1 } }, query: { verbosity: -1 }, replication: { verbosity: -1, heartbeats: { verbosity: -1 }, rollback: { verbosity:
2019-09-10T10:48:54.741+0000 D QUERY    [LogicalSessionCacheRefresh] Using idhack: { _id: { id: UUID("b943e79b-fb79-477f-a5b7-2057ac5447c1"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) } }
2019-09-10T10:48:54.746+0000 D QUERY    [LogicalSessionCacheReap] Only one plan is available; it will be run but will not be cached. query: { lastWriteDate: { $lt: new Date(1568110734741) } } sort: { _id: 1 } projection: { _id: 1 }, planSummary: IXSCAN { _id: 1 }
2019-09-10T10:53:54.741+0000 D QUERY    [LogicalSessionCacheRefresh] Using idhack: { _id: { id: UUID("b943e79b-fb79-477f-a5b7-2057ac5447c1"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) } }
2019-09-10T10:53:54.746+0000 D QUERY    [LogicalSessionCacheReap] Only one plan is available; it will be run but will not be cached. query: { lastWriteDate: { $lt: new Date(1568111034741) } } sort: { _id: 1 } projection: { _id: 1 }, planSummary: IXSCAN { _id: 1 }
r1:PRIMARY>

As you can see, some extra information about query execution is now being written to the log.

To unset this, you need to set the verbosity back to -1.

db.setLogLevel(-1,'query')
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

You can also set the verbosity in the config file, if you want the setting to still be there after a restart.

Say I want to set verbosity level 2 for the command topic.

It needs to be added under:

systemLog:
       component:
           command:
                 verbosity: 2
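
For context, a minimal sketch of a full systemLog block in mongod.conf is below; the destination, path and logAppend values are just placeholders for your own setup (the path here matches the logfile mentioned at the top of this post):

systemLog:
   destination: file
   path: /var/log/mongod/mongo.log
   logAppend: true
   component:
       command:
           verbosity: 2

After editing the file, restart mongod for the setting to take effect (db.setLogLevel() changes apply immediately but do not survive a restart).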

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

3.) Query Profiler: it can capture slow queries so you can review them later.

Mongo has 3 profiling levels: 0, 1 & 2.

0 is the default; it means no profiling data is collected at all.
1 means only slow operations are logged.
2 means all operations are logged regardless of their speed.

In level 1, you can supply a threshold in milliseconds; operations that take longer than that threshold are logged, faster operations are not.

Let's enable profiling for any operation running longer than 2 milliseconds.

r1:PRIMARY> show profile
db.system.profile is empty
Use db.setProfilingLevel(2) will enable profiling
Use db.system.profile.find() to show raw profile entries
r1:PRIMARY> db.setProfilingLevel(1,2)
{
        "was" : 0,
        "slowms" : 100,
        "sampleRate" : 1,
        "ok" : 1,
        "operationTime" : Timestamp(1568113657, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1568113657, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
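
Note that the values in the reply ("was" : 0, "slowms" : 100) are the settings from before the change. A quick sketch of confirming the new settings, and turning the profiler back off later:

db.getProfilingStatus()    // should now show level 1 and slowms 2
db.setProfilingLevel(0)    // back to the default: profiler off (slow ops are still written to the logfile)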

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
db.system.profile:
Out of all the profiling data, you can look for the queries hitting your own collections:
op: the operation type, e.g. "query"
ns: the namespace, i.e. the database.collection name
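
For example, a quick sketch of pulling profiler entries for the demo.testData collection used earlier (op, ns, millis and ts are fields the profiler writes for each entry):

use demo
db.system.profile.find({ op: "query", ns: "demo.testData" }).sort({ ts: -1 }).limit(5).pretty()

// or only the operations that took longer than 2 ms
db.system.profile.find({ ns: "demo.testData", millis: { $gt: 2 } })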



+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

4.) mongostat: lets you see metrics on several key things, printing a new sample row every second by default. mongostat is one of the command-line tools that ship with MongoDB.

[mongod@sikki4u2c ~]$ mongostat --port 30001
insert query update delete getmore command dirty used flushes vsize   res qrw arw net_in net_out conn set repl                time
    *0    *0     *0     *0       2     4|0  0.0% 0.0%       0 1.41G 79.0M 0|0 1|0  2.20k   65.1k   14  r1  PRI Sep 13 08:53:55.349
    *0    *0     *0     *0       0     2|0  0.0% 0.0%       0 1.41G 79.0M 0|0 1|0   416b   63.1k   14  r1  PRI Sep 13 08:53:56.350
    *0    *0     *0     *0       0     1|0  0.0% 0.0%       0 1.41G 79.0M 0|0 1|0   157b   62.5k   14  r1  PRI Sep 13 08:53:57.351
    *0    *0     *0     *0       0     4|0  0.0% 0.0%       0 1.41G 79.0M 0|0 1|0   697b   63.9k   14  r1  PRI Sep 13 08:53:58.349
    *0    *0     *0     *0       0     1|0  0.0% 0.0%       0 1.41G 79.0M 0|0 1|0   157b   62.5k   14  r1  PRI Sep 13 08:53:59.349
    *0    *0     *0     *0       0     3|0  0.0% 0.0%       0 1.41G 79.0M 0|0 1|0  1.43k   64.0k   14  r1  PRI Sep 13 08:54:00.350

^C2019-09-13T08:54:01.253+0000  signal 'interrupt' received; forcefully terminating


insert update delete :  each one is the number of operations per second coming from clients. Ours is a very quiet system with nothing running, so they are all zero.

getmore :  issued when there is an open cursor and the client asks for more data on it.

query :  this can be lower than "getmore" when you have few queries but each query returns many, many documents. When the documents don't all fit in the first batch of the cursor, the application issues another "getmore", and another, for the same query; therefore the query number might be lower than the getmores.

flushes :  the number of disk flushes per second. Flushes relate to I/O; high flush rates mean more I/O pressure.

res :  the number of megabytes of resident memory, roughly how much memory the mongod process consumes. This number can grow over time and approach the available RAM. That is expected: after all, mongo uses available memory to speed up operations, which is another reason to run mongo on its own server.

mapped :  megabytes of data mapped from memory-mapped files. This is relevant with "mmapv1", the classic memory-mapped-file storage engine. This number can far exceed your available RAM size, which is fine; it tells you how much of the data on disk was loaded at some point by mongo.

vsize :  the amount of virtual memory the mongod process is using.

faults :  the number of page faults per second. When data is required and is not yet in memory, it has to be brought in from disk. While we do expect mongo to need data from disk, high spikes or a large ramp-up should be investigated. This number can go up due to a number of factors; if your system starts to get bogged down by I/O, this number is often an early warning.

idx miss :  the percentage of attempts to load a part of an index that is not in memory, causing a page fault. We would like this number to be low. If the parts of the index needed over time don't fit in memory, this number can grow. Same concerns as page faults: in general I/O is the slowest part, so whatever you can do to minimize I/O, you should.

qrw :  the number of queued reads and writes. Mongo only runs a certain number of reads and writes at a given time; if more are requested by clients, those requests are queued until the current reads and writes complete. High queuing means high demand; if mongo can't keep up, the number of queued operations will spike.

arw :  the number of active reads and writes, i.e. the non-queued ones. Looking at these alongside the queued ones helps you see whether capacity is appropriate. If you see a high number of queued operations you might need to scale out or up. The number of active ops gives you a sense of how many your server can handle, because if it could handle more, the number of actives would be higher and the number of queued ones lower.

net_in net_out :  the bytes on the network, which gives a sense of overall traffic on the wire.

conn :  the number of connections, including connections from other replica set members, application connections, etc. Connections need to be controlled; a very high number of connections results in high memory usage.

set :  the replica set name.

repl :  the role of the member being monitored.
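
Most of these counters come from the server itself, so if you cannot run mongostat you can read roughly the same numbers from db.serverStatus(). A minimal sketch (the mapping to mongostat columns is approximate, and the serverStatus counters are cumulative rather than per-second):

var s = db.serverStatus()
s.opcounters                  // insert / query / update / delete / getmore / command
s.connections                 // current and available connections ("conn")
s.network                     // bytesIn / bytesOut ("net_in" / "net_out")
s.globalLock.currentQueue     // queued readers and writers ("qrw")
s.globalLock.activeClients    // active readers and writers ("arw")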

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
5.) mongotop: shows where mongo spends most of its time. It reports the time spent doing reads and writes in the most active collections.

[mongod@sikki4u2c ~]$ mongotop --port 30001
2019-09-13T16:20:55.590+0000    connected to: 127.0.0.1:30001

                    ns    total    read    write    2019-09-13T16:20:56Z
     admin.system.keys      0ms     0ms      0ms
    admin.system.roles      0ms     0ms      0ms
  admin.system.version      0ms     0ms      0ms
config.system.sessions      0ms     0ms      0ms
   config.transactions      0ms     0ms      0ms
         demo.testData      0ms     0ms      0ms
       demo.testData_1      0ms     0ms      0ms
              local.me      0ms     0ms      0ms
        local.oplog.rs      0ms     0ms      0ms
local.replset.election      0ms     0ms
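
By default mongotop prints a new sample every second. You can pass a different interval (in seconds) as the last argument and, like mongostat, limit the number of samples; for example:

mongotop --port 30001 5
mongotop --port 30001 --rowcount 3 5

The first samples every 5 seconds until interrupted; the second prints 3 samples, 5 seconds apart, and then exits.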

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

db.stats() : using this function we can get disk and memory usage estimates.

connect to any database:

r1:PRIMARY> use demo
switched to db demo
r1:PRIMARY> db.stats()
{
        "db" : "demo",
        "collections" : 2,
        "views" : 0,
        "objects" : 100020,
        "avgObjSize" : 33,
        "dataSize" : 3300660,   --- in bytes
        "storageSize" : 1691648,
        "numExtents" : 0,
        "indexes" : 2,
        "indexSize" : 954368,
        "fsUsedSize" : 10444038144,
        "fsTotalSize" : 21463281664,
        "ok" : 1,
        "operationTime" : Timestamp(1568391949, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1568391949, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}

To see the sizes in megabytes, pass a scale factor:

db.stats(1000000)
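
The same kind of breakdown is available per collection with db.collection.stats(), which helps when one collection dominates the database. A quick sketch against the collection used earlier (the scale argument works the same way as for db.stats()):

use demo
db.testData.stats(1024 * 1024)    // sizes reported in MB; look at size, storageSize, totalIndexSize, count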


++++++++++++++++++++++++++++++++++++++++++



To recap, here is the flow we studied:

1. Using mongotop, we found where mongo was spending a lot of time.
2. Using db.stats() (and the collection-level stats) on the collection in question, we found a suspicious index that may not be so useful. Removing that index reduced the size of the indexes on disk, reduced the work mongo has to do when writing to the collection, and conserved resources.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
6.) db.serverStatus() : gathers a mix of runtime statistics and resource usage. Mongo gives statistics and numbers on resource usage, locks, timings, memory, data size, flush intervals, etc.

https://docs.mongodb.com/manual/reference/command/serverStatus/

The entire output of db.serverStatus() can be too much; you can look at a specific component instead:
db.serverStatus().dur              -- durability info
db.serverStatus().mem              -- memory statistics
db.serverStatus().network          -- network statistics
db.serverStatus().metrics.cursor   -- query-related; gives an indication of open cursors
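
Since db.serverStatus() just returns a document, you can also pick out a handful of values at once when scripting a health check. A minimal sketch (these are standard serverStatus fields; the wiredTiger section only exists when the WiredTiger storage engine is in use):

var s = db.serverStatus()
print("uptime (s):     " + s.uptime)
print("connections:    " + s.connections.current)
print("open cursors:   " + s.metrics.cursor.open.total)
print("resident (MB):  " + s.mem.resident)
if (s.wiredTiger) {
    print("WT cache used (bytes): " + s.wiredTiger.cache["bytes currently in the cache"])
}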
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++




 mongostat --port 30001 --rowcount 3

 tells mongostat to print just 3 rows and then exit.


