NoSQL Intro

 

NoSQL = Not Only SQL (SQL = Structured Query Language)
The name signals that data is no longer queried ONLY through STRUCTURED (SQL) QUERIES.

With Cloud / Big Data / IoT, new requirements arrived: Volume, Variety, Velocity.
RDBMS could not respond well to these needs.

- Data can vary in shape (Variety). This means UNSTRUCTURED or semi-structured DATA, while RDBMS prefer STRUCTURED DATA.
- Data volume can reach PBs (petabytes). RDBMS generally do not operate well at PB scale.
  Data also grows continuously, but RDBMS scalability has practical limits.
  RDBMS generally scale up (a bigger server); near-unlimited scalability requires scaling out (more servers).
  To scale out, NoSQL systems typically distribute data across partitions/shards and replicate it.
  In most of these systems data replication is built in; it is a MUST, not an OPTION.

- Data velocity is high, so the DB must always perform well.
  RDBMS can run complex, long-running queries that involve JOINS or FULL TABLE SCANS etc.

  JOINS are generally not supported, or at least discouraged, in NoSQL systems.
  Design is simpler in NoSQL, and simplicity brings better performance.
  (At their core NoSQL systems are Key-Value Stores. Key-Value logic is simple, and having NO JOINS keeps it simple.)
  Query performance is generally faster and more predictable.
  (Avoiding surprises like full table scans makes response times more predictable.)
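
The scale-out idea in the list above (partitions/shards plus built-in replication) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the node names, the replication factor of 3, and the `TinyCluster` class are all made up for illustration, and the simple hash-modulo placement stands in for the consistent hashing that real systems tend to use.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names
REPLICATION_FACTOR = 3                            # assumed: each key lives on 3 nodes


def replicas_for(key, nodes=NODES, rf=REPLICATION_FACTOR):
    """Pick a primary node by hashing the key, then add the next rf-1 nodes as replicas."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    primary = int(digest, 16) % len(nodes)
    return [nodes[(primary + i) % len(nodes)] for i in range(rf)]


class TinyCluster:
    """Toy in-memory cluster: one dict per node, every write goes to all replicas."""

    def __init__(self, nodes=NODES):
        self.storage = {name: {} for name in nodes}

    def put(self, key, value):
        for node in replicas_for(key, list(self.storage)):
            self.storage[node][key] = value

    def get(self, key, down=()):
        # Try each replica in turn, skipping nodes that are currently unreachable.
        for node in replicas_for(key, list(self.storage)):
            if node not in down and key in self.storage[node]:
                return self.storage[node][key]
        raise KeyError(key)


cluster = TinyCluster()
cluster.put("user:42", {"name": "Ada"})
owners = replicas_for("user:42")
# The key is still readable when its primary node is down.
print(cluster.get("user:42", down=(owners[0],)))
```

Because every key is written to several nodes, losing one node does not lose the data, and adding nodes spreads keys over more machines. That is the sense in which replication is not optional here.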

NoSQL systems are designed for such big data processing needs.
NoSQL systems prioritize PERFORMANCE, while RDBMS prioritize CONSISTENCY.
(In NoSQL systems, the more CONSISTENCY you demand, the more LATENCY you pay.)
RDBMS are generally READ-OPTIMIZED, while many NoSQL systems are optimized for both READS and WRITES.

 - NoSQL systems have a flexible schema. (If your app still needs rigid schema rules, they must be enforced at the application level.)
   (The app can implement SCHEMA-ON-READ semantics: it projects the stored data onto the fields it needs at read time.)
 - NoSQL systems are generally used as Write-Once-Read-Many systems.
 - Relations between tables etc. are not as rigid in NoSQL as in RDBMS.
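
As a rough illustration of the schema-on-read point above, the sketch below stores records with differing fields and lets the application impose its own shape when reading. The record contents and the `read_user` helper are invented for this example; they are not tied to any particular database.

```python
import json

# Raw records as they might sit in a schemaless store; the field sets differ per record.
RAW = [
    '{"id": 1, "name": "Ada", "email": "ada@example.com"}',
    '{"id": 2, "name": "Linus", "phone": "555-0100", "tags": ["admin"]}',
    '{"id": "3", "name": "Grace"}',
]


def read_user(raw):
    """Project a raw record onto the shape this app expects, validating at read time."""
    doc = json.loads(raw)
    if "id" not in doc or "name" not in doc:
        raise ValueError(f"record is missing a required field: {raw}")
    return {
        "id": int(doc["id"]),       # coerce: some records store the id as a string
        "name": str(doc["name"]),
        "email": doc.get("email"),  # optional field, None when absent
    }


users = [read_user(r) for r in RAW]
for user in users:
    print(user)
```

The store accepted all three records as they were; the "schema" lives only in `read_user`, so a different application could project the same records differently.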

NoSQL systems are categorized as KEY-VALUE-STORE / DOCUMENT-STORE / COLUMN-STORE / GRAPH-BASED.
Only graph-based NoSQL systems store relationships as part of the data, inside the graph (but not as rigidly as RDBMS).
The other 3 categories do not model relationships between records. (These 3 are called AGGREGATE STORES.)
Data relationships can be important for AI applications, fraud detection, recommendation engines etc.
Graph databases also come in several types: "property graphs" / "hypergraphs" / "RDF triple stores".
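
To make the value of stored relationships more concrete, here is a small property-graph sketch in plain Python, used for a naive "who to follow" recommendation. The people, the `FOLLOWS` label, and the `recommend` function are all hypothetical; a real graph database would answer this with its own query language rather than hand-written loops.

```python
from collections import defaultdict

# Nodes carry properties; edges carry a label. All names here are made up.
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
    "dave": {"kind": "person"},
    "acme": {"kind": "company"},
}
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("alice", "FOLLOWS", "carol"),
    ("bob", "FOLLOWS", "dave"),
    ("bob", "FOLLOWS", "acme"),
    ("carol", "FOLLOWS", "dave"),
    ("carol", "FOLLOWS", "alice"),
]

# Index outgoing edges by (source, label) so that a traversal step is a single lookup.
out = defaultdict(list)
for src, label, dst in edges:
    out[(src, label)].append(dst)


def recommend(person):
    """Suggest people followed by those `person` follows, ranked by number of paths."""
    direct = set(out[(person, "FOLLOWS")])
    scores = defaultdict(int)
    for friend in direct:
        for candidate in out[(friend, "FOLLOWS")]:
            is_person = nodes[candidate]["kind"] == "person"
            if is_person and candidate != person and candidate not in direct:
                scores[candidate] += 1
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


print(recommend("alice"))
```

The two-hop traversal is cheap because the relationships are stored directly; in an aggregate store the same question would need application-side lookups across many records.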

Sample DB names for each NoSQL category are written below.
                        LevelDB, Oracle NoSQL, Redis                  KEY-VALUE-STORE NOSQL DB
                        MongoDB, CouchDB                              DOCUMENT-STORE NOSQL DB
                        BigTable, Cassandra, HBase                    COLUMN-STORE NOSQL DB
                        DataStax Enterprise Graph, Neo4j, OrientDB    GRAPH-BASED NOSQL DB

- At their core, NoSQL systems are Key-Value (KV) Stores. They are simple and fast.
- Document-based systems are KV stores with more advanced features.
  You can store any kind of hierarchical data with document-based solutions.
- Column-based solutions can store a very large number of columns if the application needs it.
  E.g., an Oracle database table has traditionally been limited to 1000 columns.
  In HBase, you can create tables with millions of rows and millions of columns.
- Graph-based solutions keep relations. That information can be useful for AI. The Gremlin query language (Apache TinkerPop) can be used for querying graph databases that support it.
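
The difference between the three aggregate-style categories in the list above can be sketched with plain Python dicts. This only mimics the shape of the data in each model; the keys, field names, and the `total_qty` helper are assumptions for illustration and not any product's actual storage format.

```python
import json

# 1) Key-value: the store treats the value as an opaque blob; lookup is by key only.
kv = {"user:42": b'{"name": "Ada", "city": "London"}'}

# 2) Document: the value is a nested, hierarchical document the store can look inside.
documents = {
    "42": {
        "name": "Ada",
        "address": {"city": "London"},
        "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    }
}

# 3) Wide-column: row key -> column family -> sparse columns.
#    Rows do not have to share the same columns, and a row can hold very many of them.
wide = {
    "42": {
        "profile": {"name": "Ada", "city": "London"},
        "clicks": {"2024-01-01": 3, "2024-01-02": 5},
    },
    "43": {"profile": {"name": "Linus"}, "clicks": {}},
}


def total_qty(user_id):
    """The orders are embedded in the user document, so no join is needed."""
    return sum(order["qty"] for order in documents[user_id]["orders"])


print(json.loads(kv["user:42"])["city"])  # the app, not the store, decodes the blob
print(total_qty("42"))
print(sorted(wide["42"]["clicks"]))
```

In each case the whole aggregate is fetched by one key, which is what keeps lookups simple and predictable.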

Many NoSQL systems model data as JSON (JavaScript Object Notation), sometimes with variations such as MongoDB's BSON (binary JSON).
Each DB can also have its own query language or interface, such as Cassandra CQL, the HBase shell, or HiveQL (Apache Hive can query HBase tables) etc.
NoSQL DB alternatives are still constrained by the CAP theorem, which can also be a criterion for choosing your NoSQL database.
For example, BigTable cares a lot about CONSISTENCY.
MongoDB, with its default single-primary replica sets, is usually also classified as favoring CONSISTENCY over AVAILABILITY in the event of NETWORK PARTITIONING, although its read and write settings can relax that.
Cassandra optimizes for AVAILABILITY, LATENCY and RELAXED CONSISTENCY.
In some DB solutions, tunable settings let you configure how much weight to give each of these properties.
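
One common form of such a tunable is quorum-based consistency: with N replicas, a write waits for W acknowledgements and a read contacts R replicas, and a read is guaranteed to see the latest acknowledged write when R + W > N. The sketch below is a simplified model of that rule, not any database's actual implementation; the `Replicated` class and its methods are hypothetical.

```python
def overlaps(n, w, r):
    """True when every read quorum must intersect every write quorum (R + W > N)."""
    return r + w > n


class Replicated:
    """Toy replicated register: each replica stores a (version, value) pair."""

    def __init__(self, n=3):
        self.replicas = [{"version": 0, "value": None} for _ in range(n)]

    def write(self, value, w):
        """Acknowledge after updating the first `w` replicas; the rest lag behind."""
        version = max(rep["version"] for rep in self.replicas) + 1
        for rep in self.replicas[:w]:
            rep.update(version=version, value=value)

    def read(self, r):
        """Contact the last `r` replicas (r >= 1), the worst case for overlap,
        and return the value carrying the highest version among them."""
        contacted = self.replicas[-r:]
        return max(contacted, key=lambda rep: rep["version"])["value"]


db = Replicated(n=3)
db.write("v1", w=2)
print(overlaps(3, 2, 2), db.read(r=2))  # quorums overlap: the read sees the write
print(overlaps(3, 2, 1), db.read(r=1))  # no overlap: the read may be stale
```

Lower W and R mean fewer replicas to wait for, so lower latency, at the cost of possibly stale reads. This is the same CONSISTENCY versus LATENCY trade-off described earlier, exposed as a setting.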