Official Google Cloud Certified Professional Data Engineer Study Guide. Dan Sullivan
Чтение книги онлайн.
Читать онлайн книгу Official Google Cloud Certified Professional Data Engineer Study Guide - Dan Sullivan страница 16
Table 1.4 Data definition language examples
DDL statement | Example | Explanation |
CREATE TABLE | CREATE TABLE address (address_id INT PRIMARY KEY, street_name VARCHAR(50), city VARCHAR(50), state VARCHAR(2) ); | Creates a table with four columns. The first is an integer and the primary key; the other three are variable-length character strings. |
CREATE INDEX | CREATE INDEX addr_idx ON address(state); | Creates an index on the state column of the address table. |
ALTER TABLE | ALTER TABLE address ADD (zip VARCHAR(9)); | Adds a column called zip to the address table. ALTER is also used to modify and drop entities. |
DROP INDEX | DROP INDEX addr_idx; | Deletes the index addr_idx. |
Table 1.5 Data manipulation language examples
Data Manipulation Language | ||
DML Statement | Example | Explanation |
INSERT | INSERT INTO address VALUES (1234, ’56 Main St’, ’Seattle’, ’WA’); | Adds rows to the table with the specified values, which are in column order |
UPDATE | UPDATE address SET state = ’OR’ | Sets the value of the state column to ’OR’ for all rows |
DELETE | DELETE FROM address WHERE state = ’OR’ | Removes all rows that have the value ’OR’ in the state column |
Table 1.6 Data query language examples
Data Query Language | ||
DDL statement | Example | Explanation |
SELECT … FROM | SELECT address_id, state FROM address | Returns the address_id and state values for all rows in the address table |
SELECT … FROM … WHERE | SELECT address_id, state FROM address WHERE state = ’OR’ | Returns the address_id and state values for all rows in the address table that have the value ’OR’ in the state column |
SELECT … FROM … GROUP BY | SELECT state, COUNT(*) FROM address GROUP BY state | Returns the number of addresses in each state |
SELECT … FROM … GROUP BY … HAVING | SELECT state, COUNT(*) FROM address GROUP BY state HAVING COUNT(*) > 50 | Returns the number of addresses in each state that has at least 50 addresses |
NoSQL Database Design
NoSQL databases are less structured than relational databases, and there is no formal model, like relational algebra and forms of normalization, that apply to all NoSQL databases. The four types of NoSQL databases available in GCP are
Key-value
Document
Wide column
Graph
Each type of NoSQL database is suited for different use cases depending on data ingestion, entity relationships, and query requirements.
Key-Value Data Stores
Key-value data stores are databases that use associative arrays or dictionaries as the basic datatype. Keys are data used to look up values. An example of key-value data is shown in Table 1.7, which displays a mapping from names of machine instances to names of partitions associated with each instance.
Table 1.7 Examples of key-value data
Key | Value |
Instance1 | PartitionA |
Instance2 | PartitionB |
Instance3 | PartitionA |
Instance4 | PartitionC |
Key-value data stores are simple, but it is possible to have more complex data structures as values. For example, a JSON object could be stored as a value. This would be reasonable use of a key-value data store if the JSON object was only looked up by the key, and there was no need to search on items within the JSON structure. In situations where items in the JSON structure should be searchable, a document database would be a better option.
Cloud Memorystore is a fully managed key-value data store based on Redis, a popular open source key-value datastore. As of this writing, Cloud Memorystore does not support persistence, so it should not be used for applications that do not need to save data to persistent storage. Open source Redis does support persistence. If you wanted to use Redis for a key-value store and wanted persistent storage, then you could run and manage your own Redis service in Compute Engine or Kubernetes Engine.
Document Databases
Document stores allow complex data structures, called documents, to be used as values and accessed in more ways than simple key lookup. When designing a data model for document databases, documents should be designed to group data that is read together.
Consider an online game that requires a database to store information about players’ game state. The player state includes
Player name
Health score
List of possessions
List