OS knowledge syllabus 2

. . .

Session7-SQL

  • What is data modeling? Why do we need it? When would you need it?
  • What is primary key? How is it different from unique key?
  • What is normalization? Why do you need to normalize?
  • What does data redundancy mean? Can you give an example of each?
  • What is database integrity? Why do you need it?
  • What are joins and explain different types of joins in detail.
  • Explain indexes and why are they needed?
  • If we have 1B data in our relational database and we do not want to fetch all at once. What are the ways that we can partition the data rows?

Explain clustered and non-clustered index and their differences.

1. Clustered Index

Definition

A clustered index determines the physical order of data storage in a table. In other words, the rows of the table are physically arranged on disk in the order of the clustered index key. A table can have only one clustered index because there can be only one physical ordering of the data rows.

How it Works

  • Index Structure: The clustered index is often implemented as a B - tree data structure. The leaf nodes of the B - tree contain the actual data rows of the table, sorted according to the index key.
  • Data Retrieval: When you query data using the columns in the clustered index, the database can quickly locate the relevant rows because they are physically stored in the order of the index. For example, if you have a Customers table with a clustered index on the CustomerID column, and you query for a specific CustomerID, the database can efficiently navigate through the B - tree to find the corresponding row.

Example

-- Create a table with a clustered index on the ID column
CREATE TABLE Products (
ProductID INT PRIMARY KEY CLUSTERED,
ProductName VARCHAR(100),
Price DECIMAL(10, 2)
);

In this example, the ProductID column is the clustered index. The rows in the Products table will be physically sorted by the ProductID value.

2. Non - Clustered Index

Definition

A non - clustered index is a separate structure from the actual data rows. It contains a copy of the indexed columns and a pointer to the location of the corresponding data row in the table. A table can have multiple non - clustered indexes.

How it Works

  • Index Structure: Similar to a clustered index, a non - clustered index is also typically implemented as a B - tree. However, the leaf nodes of the non - clustered index do not contain the actual data rows but rather pointers to the data rows in the table.
  • Data Retrieval: When you query data using the columns in a non - clustered index, the database first searches the non - clustered index to find the pointers to the relevant data rows. Then it uses these pointers to access the actual data rows in the table. This additional step of accessing the data rows can make non - clustered index lookups slightly slower than clustered index lookups for large datasets.

Example

-- Create a table
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
OrderDate DATE
);

-- Create a non - clustered index on the CustomerID column
CREATE NONCLUSTERED INDEX idx_CustomerID ON Orders (CustomerID);

In this example, the idx_CustomerID is a non - clustered index on the CustomerID column. The index stores the CustomerID values and pointers to the corresponding rows in the Orders table.

3. Differences between Clustered and Non - Clustered Indexes

Physical Order of Data

  • Clustered Index: Determines the physical order of data storage in the table. The data rows are physically sorted according to the clustered index key.
  • Non - Clustered Index: Does not affect the physical order of data in the table. It is a separate structure that points to the data rows.

Number of Indexes per Table

  • Clustered Index: A table can have only one clustered index because there can be only one physical ordering of the data.
  • Non - Clustered Index: A table can have multiple non - clustered indexes. You can create non - clustered indexes on different columns or combinations of columns to improve query performance for various types of queries.

Storage Space

  • Clustered Index: Since it stores the actual data rows, it generally requires more storage space compared to a non - clustered index.
  • Non - Clustered Index: Stores only the indexed columns and pointers to the data rows, so it usually requires less storage space.

Query Performance

  • Clustered Index: Is very efficient for range queries (e.g., retrieving all rows where the index value is between a certain range) because the data is physically sorted. It also has an advantage for queries that return a large number of rows.
  • Non - Clustered Index: Is useful for queries that filter on a small subset of data using the indexed columns. However, for queries that need to access a large number of rows, the additional step of following the pointers to the data rows can make it slower than using a clustered index.

Insert, Update, and Delete Operations

  • Clustered Index: Inserting, updating, or deleting rows can be more expensive because it may require re - arranging the physical order of the data on disk.
  • Non - Clustered Index: These operations are generally less expensive because they only involve updating the non - clustered index structure and the pointers, without affecting the physical order of the data.

What are normal forms

In the context of databases, “NF” usually stands for “Normal Form”. Normal forms are used in database design to organize data in a way that reduces data redundancy, improves data integrity, and makes the database more efficient and easier to manage. Some of the commonly known normal forms are:

  • First Normal Form (1NF): A relation is in 1NF if it has atomic values, meaning that each cell in the table contains only a single value and not a set of values. For example, a table where a column stores multiple phone numbers separated by commas would not be in 1NF.
  • Second Normal Form (2NF): A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key. This means that no non-key attribute should depend only on a part of the primary key in case of a composite primary key.
  • Third Normal Form (3NF): A relation is in 3NF if it is in 2NF and there is no transitive dependency of non-key attributes on the primary key. That is, a non-key attribute should not depend on another non-key attribute.
  • Boyce-Codd Normal Form (BCNF): BCNF is a stronger version of 3NF. A relation is in BCNF if for every functional dependency X → Y, X is a superkey. In other words, every determinant must be a candidate key.
  • Fourth Normal Form (4NF): A relation is in 4NF if it is in BCNF and there are no non-trivial multivalued dependencies.

1. Examples of Normalization

First Normal Form (1NF)

Original Table (Not in 1NF):
Suppose we have a Students table that stores information about students and their hobbies.

Student ID Student Name Hobbies
1 John Reading, Painting
2 Jane Singing, Dancing

The Hobbies column contains multiple values separated by commas, which violates 1NF.

Converted to 1NF:
We create a new table structure.
Students Table:

Student ID Student Name
1 John
2 Jane

StudentHobbies Table:

Student ID Hobby
1 Reading
1 Painting
2 Singing
2 Dancing

Second Normal Form (2NF)

Original Table (Violating 2NF):
Consider an Orders table with a composite primary key (Order ID, Product ID).

Order ID Product ID Product Name Order Quantity
1 101 Laptop 2
1 102 Mouse 3
2 101 Laptop 1

The Product Name depends only on the Product ID (part of the composite primary key), violating 2NF.

Converted to 2NF:
Products Table:

Product ID Product Name
101 Laptop
102 Mouse

OrderDetails Table:

Order ID Product ID Order Quantity
1 101 2
1 102 3
2 101 1

Third Normal Form (3NF)

Original Table (Violating 3NF):
Let’s have an Employees table.

Employee ID Department ID Department Name Employee Salary
1 1 IT 5000
2 1 IT 6000
3 2 HR 4500

The Department Name is transitively dependent on the Employee ID through the Department ID, violating 3NF.

Converted to 3NF:
Departments Table:

Department ID Department Name
1 IT
2 HR

Employees Table:

Employee ID Department ID Employee Salary
1 1 5000
2 1 6000
3 2 4500

2. Examples of Database Integrity

Entity Integrity

  • Explanation: Ensures that each row in a table is uniquely identifiable, usually through a primary key.
  • Example: In a Customers table, the Customer ID is set as the primary key.
    CREATE TABLE Customers (
    Customer ID INT PRIMARY KEY,
    Customer Name VARCHAR(100),
    Email VARCHAR(100)
    );

If you try to insert a new row with an existing Customer ID, the database will reject the insert operation because it violates entity integrity.

Referential Integrity

  • Explanation: Maintains the consistency between related tables. A foreign key in one table must match a primary key value in another table.
  • Example: Consider a Orders table and a Customers table. The Orders table has a foreign key Customer ID that references the Customer ID in the Customers table.
    CREATE TABLE Customers (
    Customer ID INT PRIMARY KEY,
    Customer Name VARCHAR(100)
    );

    CREATE TABLE Orders (
    Order ID INT PRIMARY KEY,
    Customer ID INT,
    Order Date DATE,
    FOREIGN KEY (Customer ID) REFERENCES Customers(Customer ID)
    );

If you try to insert an order with a Customer ID that does not exist in the Customers table, the database will not allow it due to referential integrity.

Domain Integrity

  • Explanation: Ensures that the data entered into a column falls within an acceptable range of values.
  • Example: In a Products table, the Price column should only accept positive values.
    CREATE TABLE Products (
    Product ID INT PRIMARY KEY,
    Product Name VARCHAR(100),
    Price DECIMAL(10, 2) CHECK (Price > 0)
    );

If you try to insert a product with a negative price, the database will reject the insert because it violates domain integrity.

How do you represent a multi-valued attribute in a database?

A multi - valued attribute is an attribute that can have multiple values for a single entity. Here are the common ways to represent multi - valued attributes in different types of databases:

Relational Databases

1. Using a Separate Table (Normalization Approach)

This is the most common and recommended method in relational databases as it adheres to the principles of database normalization.

Steps:

  • Identify the Entities and Attributes: Suppose you have an Employees entity with a multi - valued attribute Skills. An employee can have multiple skills, so the Skills attribute is multi - valued.
  • Create a New Table: Create a new table to store the multi - valued data. This table will have a foreign key that references the primary key of the main entity table.
  • Define the Schema:

    -- Create the Employees table
    CREATE TABLE Employees (
    employee_id INT PRIMARY KEY AUTO_INCREMENT,
    employee_name VARCHAR(100)
    );

    -- Create the Skills table
    CREATE TABLE Skills (
    skill_id INT PRIMARY KEY AUTO_INCREMENT,
    employee_id INT,
    skill_name VARCHAR(50),
    FOREIGN KEY (employee_id) REFERENCES Employees(employee_id)
    );
  • Insert and Query Data:

    -- Insert an employee
    INSERT INTO Employees (employee_name) VALUES ('John Doe');

    -- Insert skills for the employee
    INSERT INTO Skills (employee_id, skill_name) VALUES (1, 'Java');
    INSERT INTO Skills (employee_id, skill_name) VALUES (1, 'Python');

    -- Query all skills of an employee
    SELECT skill_name
    FROM Skills
    WHERE employee_id = 1;

2. Using Delimited Lists (Denormalization Approach)

In some cases, for simplicity or performance reasons, you may choose to use delimited lists to represent multi - valued attributes.

Steps:

  • Modify the Main Table: Instead of creating a separate table, you add a single column to the main table and store multiple values separated by a delimiter (e.g., comma).

    -- Create the Employees table with a multi - valued attribute as a delimited list
    CREATE TABLE Employees (
    employee_id INT PRIMARY KEY AUTO_INCREMENT,
    employee_name VARCHAR(100),
    skills VARCHAR(200)
    );
  • Insert and Query Data:

    -- Insert an employee with skills
    INSERT INTO Employees (employee_name, skills) VALUES ('John Doe', 'Java,Python');

    -- Query employees with a specific skill
    SELECT *
    FROM Employees
    WHERE skills LIKE '%Java%';

However, this approach has several drawbacks. It violates the first normal form of database normalization, making it difficult to perform data manipulation and queries, and it can lead to data integrity issues.

Non - Relational Databases

1. Document Databases (e.g., MongoDB)

In document databases, multi - valued attributes can be easily represented as arrays within a document.

Steps:

  • Define the Document Structure: Create a collection and define the document structure to include an array for the multi - valued attribute.

    // Insert a document in the Employees collection
    db.employees.insertOne({
    employee_name: 'John Doe',
    skills: ['Java', 'Python']
    });
  • Query Data:

    // Query employees with a specific skill
    db.employees.find({ skills: 'Java' });

2. Graph Databases (e.g., Neo4j)

In graph databases, multi - valued attributes can be represented as relationships between nodes.

Steps:

  • Create Nodes and Relationships: Create nodes for the main entity and the values of the multi - valued attribute, and then create relationships between them.

    // Create an employee node
    CREATE (:Employee {name: 'John Doe'})
    // Create skill nodes
    CREATE (:Skill {name: 'Java'})
    CREATE (:Skill {name: 'Python'})
    // Create relationships between the employee and skills
    MATCH (e:Employee {name: 'John Doe'}), (s1:Skill {name: 'Java'}), (s2:Skill {name: 'Python'})
    CREATE (e)-[:HAS_SKILL]->(s1)
    CREATE (e)-[:HAS_SKILL]->(s2);
  • Query Data:

    // Query all skills of an employee
    MATCH (e:Employee {name: 'John Doe'})-[:HAS_SKILL]->(s:Skill)
    RETURN s.name;

How do you represent a many-to-many relationship in database?

Here are the common ways to represent a many - to - many relationship in a database:

1. Using a Junction Table (Associative Table)

This is the most prevalent method in relational databases.

Suppose you have two entities that have a many - to - many relationship. For example, in a school database, “Students” and “Courses”. A student can enroll in multiple courses, and a course can have multiple students.

Step 2: Create the junction table

The junction table contains at least two foreign keys, each referencing the primary key of one of the related tables.

  • Table creation in SQL (for MySQL):
    -- Create the Students table
    CREATE TABLE Students (
    student_id INT PRIMARY KEY AUTO_INCREMENT,
    student_name VARCHAR(100)
    );

    -- Create the Courses table
    CREATE TABLE Courses (
    course_id INT PRIMARY KEY AUTO_INCREMENT,
    course_name VARCHAR(100)
    );

    -- Create the junction table (Enrollments)
    CREATE TABLE Enrollments (
    student_id INT,
    course_id INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES Students(student_id),
    FOREIGN KEY (course_id) REFERENCES Courses(course_id)
    );

In this example, the Enrollments table is the junction table. The combination of student_id and course_id forms a composite primary key, which ensures that each enrollment (a relationship between a student and a course) is unique.

Step 3: Insert and query data

  • Inserting data:

    -- Insert a student
    INSERT INTO Students (student_name) VALUES ('John Doe');
    -- Insert a course
    INSERT INTO Courses (course_name) VALUES ('Mathematics');
    -- Record the enrollment
    INSERT INTO Enrollments (student_id, course_id) VALUES (1, 1);
  • Querying data: To find all courses a student is enrolled in, or all students enrolled in a course, you can use JOIN operations.

    -- Find all courses John Doe is enrolled in
    SELECT Courses.course_name
    FROM Students
    JOIN Enrollments ON Students.student_id = Enrollments.student_id
    JOIN Courses ON Enrollments.course_id = Courses.course_id
    WHERE Students.student_name = 'John Doe';

2. In Non - Relational Databases

Graph Databases

  • In graph databases like Neo4j, a many - to - many relationship is represented by nodes and relationships. Each entity is a node, and the relationship between them is an edge.
  • For example, you can create Student nodes and Course nodes. Then, you can create a ENROLLED_IN relationship between the Student and Course nodes.
    // Create a student node
    CREATE (:Student {name: 'John Doe'})
    // Create a course node
    CREATE (:Course {name: 'Mathematics'})
    // Create the enrollment relationship
    MATCH (s:Student {name: 'John Doe'}), (c:Course {name: 'Mathematics'})
    CREATE (s)-[:ENROLLED_IN]->(c);

Document Databases

  • In document databases such as MongoDB, you can use arrays to represent many - to - many relationships in a denormalized way. For example, in the students collection, each student document can have an array of course IDs, and in the courses collection, each course document can have an array of student IDs. However, this approach can lead to data duplication and potential consistency issues.
    // Insert a student document
    db.students.insertOne({
    name: 'John Doe',
    courses: [ObjectId("1234567890abcdef12345678"), ObjectId("234567890abcdef12345678")]
    });
    // Insert a course document
    db.courses.insertOne({
    name: 'Mathematics',
    students: [ObjectId("abcdef1234567890abcdef12"), ObjectId("bcdef1234567890abcdef12")]
    });

Session15-TRANSACTION JPA

  1. What is “Offline Transaction”?
  2. How do we usually perform Transaction Management in JDBC?
  3. What is Database Transaction?
  4. What are entity states defined in Hibernate / JPA?
  5. How can we transfer the entity between different states?
  6. What are differences between save, persist?
  7. What are differences between update, merge and saveOrUpdate?
  8. How do you use elasticSearch in your java application

1. What is “Offline Transaction”?

An offline transaction in the context of databases is a set of operations on data that occur without an immediate, real - time connection to the database server. The operations are carried out on a local copy of the data, and the changes are later synchronized with the main database.

Example:

  • Mobile Banking App: A user opens a mobile banking app on their smartphone while on an airplane (no internet connection). They can view their account balance, transaction history which is stored locally. They can also initiate a new fund transfer. The app records this transfer request in a local database on the phone. Once the plane lands and the phone connects to the internet, the app synchronizes with the bank’s central database, uploading the new transfer request and downloading any new account updates.
  • Field Salesperson: A salesperson visits clients in an area with poor network coverage. Using a tablet, they access a local copy of the customer database. They add new customer details and record sales orders. Later, when they get back to an area with a network, the tablet syncs the new data with the company’s central database.

2. How do we usually perform Transaction Management in JDBC?

In JDBC (Java Database Connectivity), transaction management involves the following steps:

Step 1: Disable Auto - Commit Mode
By default, JDBC operates in auto - commit mode where each SQL statement is treated as a separate transaction. To group multiple statements into a single transaction, we need to disable auto - commit.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class JDBCTransactionExample {
public static void main(String[] args) {
Connection connection = null;
try {
// Establish a connection
connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "user", "password");
// Disable auto - commit
connection.setAutoCommit(false);

Statement statement = connection.createStatement();
// Execute SQL statements
statement.executeUpdate("INSERT INTO employees (name, salary) VALUES ('John', 5000)");
statement.executeUpdate("UPDATE departments SET budget = budget - 5000 WHERE dept_name = 'IT'");

// Commit the transaction
connection.commit();
} catch (SQLException e) {
try {
if (connection != null) {
// Rollback the transaction in case of an error
connection.rollback();
}
} catch (SQLException ex) {
ex.printStackTrace();
}
e.printStackTrace();
} finally {
try {
if (connection != null) {
connection.close();
}
} catch (SQLException e) {
e.printStackTrace();
}
}
}
}

Explanation:

  • connection.setAutoCommit(false): Disables auto - commit mode so that statements are grouped into a single transaction.
  • connection.commit(): Commits all the statements in the transaction if everything goes well.
  • connection.rollback(): Rolls back all the statements in the transaction if an error occurs.

3. What is Database Transaction?

A database transaction is a sequence of one or more SQL statements that are treated as a single unit of work. It must satisfy the ACID properties:

  • Atomicity: Either all the statements in the transaction are executed successfully, or none of them are. For example, in a bank transfer, if you transfer money from one account to another, either both the debit from the source account and the credit to the destination account happen, or neither does.
  • Consistency: The transaction takes the database from one consistent state to another. For instance, if a rule in the database states that the total balance of all accounts should always be the same, a transaction should maintain this consistency.
  • Isolation: Transactions are isolated from each other. One transaction should not be affected by the intermediate states of other concurrent transactions. For example, if two users are trying to transfer money at the same time, their transactions should not interfere with each other.
  • Durability: Once a transaction is committed, its changes are permanent and will survive any subsequent system failures.

4. What are entity states defined in Hibernate / JPA?

In Hibernate and JPA (Java Persistence API), entities can be in one of the following states:

  • Transient: An entity is transient when it is created using the new keyword and has not been associated with a persistence context. It has no corresponding row in the database.

    // Transient entity
    Employee employee = new Employee();
    employee.setName("Jane");
  • Persistent: A persistent entity is associated with a persistence context and has a corresponding row in the database. Any changes made to a persistent entity will be automatically synchronized with the database when the transaction is committed.

    EntityManager entityManager = entityManagerFactory.createEntityManager();
    entityManager.getTransaction().begin();
    Employee employee = entityManager.find(Employee.class, 1L);
    // Now the employee is in persistent state
  • Detached: A detached entity was once persistent but is no longer associated with a persistence context. It still has a corresponding row in the database, but changes made to it will not be automatically synchronized.

    entityManager.getTransaction().commit();
    entityManager.close();
    // Now the employee is in detached state
  • Removed: An entity is in the removed state when it has been marked for deletion from the database. Once the transaction is committed, the corresponding row in the database will be deleted.

    entityManager.getTransaction().begin();
    Employee employee = entityManager.find(Employee.class, 1L);
    entityManager.remove(employee);
    // Now the employee is in removed state

5. How can we transfer the entity between different states?

  • Transient to Persistent: Use methods like persist() or save() in Hibernate. In JPA, you can use EntityManager.persist().

    EntityManager entityManager = entityManagerFactory.createEntityManager();
    entityManager.getTransaction().begin();
    Employee employee = new Employee();
    employee.setName("Tom");
    entityManager.persist(employee);
    // Now the employee is in persistent state
  • Persistent to Detached: Closing the EntityManager or clearing the persistence context will make a persistent entity detached.

    entityManager.getTransaction().commit();
    entityManager.close();
    // The previously persistent entity is now detached
  • Detached to Persistent: Use the merge() method in JPA.

    EntityManager newEntityManager = entityManagerFactory.createEntityManager();
    newEntityManager.getTransaction().begin();
    Employee detachedEmployee = getDetachedEmployee();
    Employee persistentEmployee = newEntityManager.merge(detachedEmployee);
    // Now the entity is back in persistent state
  • Persistent/Detached to Removed: Use the remove() method in JPA.

    entityManager.getTransaction().begin();
    Employee employee = entityManager.find(Employee.class, 1L);
    entityManager.remove(employee);
    // Now the employee is in removed state

6. What are differences between save, persist?

  • save() (Hibernate - specific):

    • Returns the generated identifier immediately. It can be used to insert a new entity into the database. If the entity is already persistent, it may throw an exception.
      Session session = sessionFactory.openSession();
      Transaction transaction = session.beginTransaction();
      Employee employee = new Employee();
      employee.setName("Alice");
      Serializable id = session.save(employee);
      transaction.commit();
      session.close();
  • persist() (JPA - standard):

    • Does not guarantee that the identifier will be assigned immediately. It is used to make a transient entity persistent. If the entity is already persistent, it will have no effect.
      EntityManager entityManager = entityManagerFactory.createEntityManager();
      entityManager.getTransaction().begin();
      Employee employee = new Employee();
      employee.setName("Bob");
      entityManager.persist(employee);
      entityManager.getTransaction().commit();
      entityManager.close();

7. What are differences between update, merge and saveOrUpdate?

  • update() (Hibernate - specific):

    • Used to make a detached entity persistent. If the entity is already persistent, it may throw an exception. It directly updates the database row corresponding to the entity.
      Session session = sessionFactory.openSession();
      Transaction transaction = session.beginTransaction();
      Employee detachedEmployee = getDetachedEmployee();
      session.update(detachedEmployee);
      transaction.commit();
      session.close();
  • merge() (JPA - standard):

    • Creates a copy of the detached entity, makes the copy persistent, and returns the persistent copy. The original detached entity remains detached. It can handle both transient and detached entities.
      EntityManager entityManager = entityManagerFactory.createEntityManager();
      entityManager.getTransaction().begin();
      Employee detachedEmployee = getDetachedEmployee();
      Employee mergedEmployee = entityManager.merge(detachedEmployee);
      entityManager.getTransaction().commit();
      entityManager.close();
  • saveOrUpdate() (Hibernate - specific):

    • Checks if the entity has an identifier. If it does not have an identifier, it acts like save(). If it has an identifier, it acts like update().
      Session session = sessionFactory.openSession();
      Transaction transaction = session.beginTransaction();
      Employee employee = new Employee();
      session.saveOrUpdate(employee);
      transaction.commit();
      session.close();

8. How do you use Elasticsearch in your Java application?

To use Elasticsearch in a Java application, you can follow these steps:

Step 1: Add Dependencies
If you are using Maven, add the following dependencies to your pom.xml:

<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.17.3</version>
</dependency>

Step 2: Create a Client

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ElasticsearchClientExample {
public static void main(String[] args) {
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(new HttpHost("localhost", 9200, "http")));
// Use the client for operations
try {
// Perform operations like indexing, searching, etc.
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
client.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
}

Step 3: Index a Document

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.common.xcontent.XContentType;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class ElasticsearchIndexExample {
public static void main(String[] args) {
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(new HttpHost("localhost", 9200, "http")));

Map<String, Object> jsonMap = new HashMap<>();
jsonMap.put("title", "Elasticsearch Tutorial");
jsonMap.put("content", "Learn how to use Elasticsearch in Java");

IndexRequest request = new IndexRequest("my_index")
.id("1")
.source(jsonMap, XContentType.JSON);

try {
IndexResponse indexResponse = client.index(request, RequestOptions.DEFAULT);
System.out.println(indexResponse);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
client.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}

Step 4: Search for Documents

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.io.IOException;

public class ElasticsearchSearchExample {
public static void main(String[] args) {
RestHighLevelClient client = new RestHighLevelClient(
RestClient.builder(new HttpHost("localhost", 9200, "http")));

SearchRequest searchRequest = new SearchRequest("my_index");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchQuery("title", "Elasticsearch"));
searchRequest.source(searchSourceBuilder);

try {
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
System.out.println(searchResponse);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
client.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}

Session17-UNIT TEST

  1. Explain and name some methods that you used in JUnit.
  2. Explain and name some annotations that you used in JUnit.
  3. What is Mockito and the usage of it?

1. Commonly - Used Methods in JUnit

assertEquals()

  • Explanation: This method is used to verify if two values are equal. It is very useful when you want to check if the result of a method call in your code under test matches the expected result.
  • Example:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    public class CalculatorTest {
    @Test
    public void testAddition() {
    Calculator calculator = new Calculator();
    int result = calculator.add(2, 3);
    assertEquals(5, result);
    }
    }

    class Calculator {
    public int add(int a, int b) {
    return a + b;
    }
    }

assertTrue() and assertFalse()

  • Explanation: assertTrue() is used to verify if a given condition is true, and assertFalse() is used to verify if a condition is false. These are handy when you want to check the truth - value of a boolean expression returned by a method.
  • Example:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertTrue;
    import static org.junit.jupiter.api.Assertions.assertFalse;

    public class StringUtilTest {
    @Test
    public void testIsEmpty() {
    StringUtil stringUtil = new StringUtil();
    assertTrue(stringUtil.isEmpty(""));
    assertFalse(stringUtil.isEmpty("Hello"));
    }
    }

    class StringUtil {
    public boolean isEmpty(String str) {
    return str == null || str.length() == 0;
    }
    }

assertNull() and assertNotNull()

  • Explanation: assertNull() checks if an object reference is null, while assertNotNull() checks if an object reference is not null. They are useful when you need to ensure that a method returns or does not return a null value.
  • Example:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertNull;
    import static org.junit.jupiter.api.Assertions.assertNotNull;

    public class ObjectFactoryTest {
    @Test
    public void testCreateObject() {
    ObjectFactory objectFactory = new ObjectFactory();
    Object obj = objectFactory.createObject();
    assertNotNull(obj);
    Object nullObj = objectFactory.createNullObject();
    assertNull(nullObj);
    }
    }

    class ObjectFactory {
    public Object createObject() {
    return new Object();
    }

    public Object createNullObject() {
    return null;
    }
    }

2. Commonly - Used Annotations in JUnit

@Test

  • Explanation: This annotation is used to mark a method as a test method. JUnit will execute all methods annotated with @Test when running the test class.
  • Example:
    import org.junit.jupiter.api.Test;

    public class SimpleTest {
    @Test
    public void testSomething() {
    // Test logic here
    }
    }

@BeforeEach

  • Explanation: Methods annotated with @BeforeEach are executed before each test method. This is useful for setting up the test environment, such as initializing objects or variables that are needed for each test.
  • Example:
    import org.junit.jupiter.api.BeforeEach;
    import org.junit.jupiter.api.Test;

    public class UserServiceTest {
    private UserService userService;

    @BeforeEach
    public void setUp() {
    userService = new UserService();
    }

    @Test
    public void testCreateUser() {
    // Use userService for testing
    }
    }

    class UserService {
    // Class implementation
    }

@AfterEach

  • Explanation: Methods annotated with @AfterEach are executed after each test method. This is used for cleaning up resources, such as closing database connections or releasing memory.
  • Example:
    import org.junit.jupiter.api.AfterEach;
    import org.junit.jupiter.api.Test;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    public class FileServiceTest {
    private File tempFile;

    @Test
    public void testWriteToFile() throws IOException {
    tempFile = new File("temp.txt");
    FileWriter writer = new FileWriter(tempFile);
    writer.write("Test data");
    writer.close();
    }

    @AfterEach
    public void tearDown() {
    if (tempFile != null && tempFile.exists()) {
    tempFile.delete();
    }
    }
    }

@BeforeAll and @AfterAll

  • Explanation: @BeforeAll is used to annotate a static method that will be executed once before all the test methods in the class. @AfterAll is used to annotate a static method that will be executed once after all the test methods in the class. These are useful for performing expensive setup and cleanup operations, like starting and stopping a database server.
  • Example:
    import org.junit.jupiter.api.BeforeAll;
    import org.junit.jupiter.api.AfterAll;
    import org.junit.jupiter.api.Test;

    public class DatabaseServiceTest {
    private static DatabaseService databaseService;

    @BeforeAll
    public static void setUpAll() {
    databaseService = new DatabaseService();
    databaseService.startDatabase();
    }

    @Test
    public void testQueryDatabase() {
    // Test database query
    }

    @AfterAll
    public static void tearDownAll() {
    databaseService.stopDatabase();
    }
    }

    class DatabaseService {
    public void startDatabase() {
    // Start database logic
    }

    public void stopDatabase() {
    // Stop database logic
    }
    }

3. What is Mockito and its Usage

Definition

Mockito is a popular open - source testing framework for Java that allows you to create mock objects. Mock objects are simulated objects that mimic the behavior of real objects in a controlled way. They are used to isolate the code under test from its dependencies, making unit tests more reliable and faster.

Common Usages

Creating Mock Objects

  • You can use Mockito.mock() to create a mock object of a class or an interface.
    import org.junit.jupiter.api.Test;
    import static org.mockito.Mockito.mock;

    public class MockitoExample {
    @Test
    public void testMockCreation() {
    MyInterface myMock = mock(MyInterface.class);
    // Now myMock is a mock object of MyInterface
    }
    }

    interface MyInterface {
    void doSomething();
    }

Stubbing Methods

  • Stubbing means defining the behavior of a method on a mock object. You can use methods like when() and thenReturn() to stub methods.
    import org.junit.jupiter.api.Test;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.when;

    public class StubbingExample {
    @Test
    public void testStubbing() {
    MyService myService = mock(MyService.class);
    when(myService.getResult()).thenReturn(10);
    int result = myService.getResult();
    // result will be 10
    }
    }

    class MyService {
    public int getResult() {
    return 0;
    }
    }

Verifying Method Calls

  • You can use Mockito.verify() to check if a method on a mock object has been called with specific arguments.
    import org.junit.jupiter.api.Test;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.verify;

    public class VerificationExample {
    @Test
    public void testVerification() {
    MyInterface myMock = mock(MyInterface.class);
    myMock.doSomething();
    verify(myMock).doSomething();
    }
    }

    interface MyInterface {
    void doSomething();
    }

Session19-MICROSERVICE

  1. In your own word, please describe some of the advantages and disadvantages of a
  2. Monolithic Application.
  3. In your own word, please describe some of the advantages and disadvantages of a
  4. Microservice Application.
  5. What is the purpose of using Netflix Eureka?
  6. How can microservices communicate with each other?
  7. What is the purpose of using Spring API Gateway?
  8. Explain cascading failure in microservice and how to prevent it.
  9. Explain CircuitBreaker and how it works in detail.

The following is an explanation of each question along with relevant examples:

Monolithic Application

  • Advantages
    • Simplicity: It’s a single unit, easy to develop, test and deploy. For example, a small blog website built with a monolithic architecture can be developed quickly as all the components are in one place.
    • Ease of Data Management: All components can access the same database easily, simplifying data consistency. In a monolithic e-commerce app, the product, order and user data can be managed centrally.
    • Good for Small Projects: Ideal for small-scale applications with low complexity and clear requirements. A simple internal management system for a small company may not need the complexity of a distributed architecture.
  • Disadvantages
    • Scalability Issues: As the application grows, it becomes hard to scale. If a monolithic social media app experiences a sudden traffic spike, scaling the entire application is more difficult and expensive than scaling individual components.
    • Slow Deployment: Any change requires redeploying the entire application. If you want to update a single feature in a monolithic banking app, the whole app needs to be deployed, causing potential downtime.
    • Technology Limitations: It’s hard to adopt new technologies or frameworks in a monolithic structure. For example, if you want to use a new data processing framework in a monolithic app that’s already using an old tech stack, it may require a major rewrite.

Microservice Application

  • Advantages
    • High Scalability: Each microservice can be scaled independently. In a large e-commerce platform like Amazon, the order processing, inventory management and user profile services can be scaled based on their specific load.
    • Technology Diversity: Different microservices can use different technologies based on their requirements. For example, the image processing microservice can use a different technology stack than the user authentication microservice.
    • Faster Deployment: Only the updated microservice needs to be deployed. If a new feature is added to the payment microservice of a fintech app, only that microservice is deployed, minimizing downtime.
  • Disadvantages
    • Complexity in Management: Managing multiple microservices, their communication and dependencies is complex. For example, coordinating data updates across multiple microservices in a healthcare application can be challenging.
    • Data Consistency: Ensuring data consistency across multiple microservices is difficult. In a microservices-based ride-hailing app, maintaining the consistency of driver and rider data across different services can be a problem.
    • Testing Complexity: Testing the entire system becomes more complex as it involves testing multiple microservices and their interactions. Testing a microservices-based logistics app requires testing each service and how they work together.

Netflix Eureka

  • Purpose: It’s a service discovery tool. It allows microservices in a distributed system to register and discover each other. For example, in a microservices architecture where there are multiple user service instances and order service instances, Eureka helps the order service find the available user service instances to communicate with.

Microservices Communication

  • Methods:
    • RESTful API: Microservices can communicate via HTTP requests using RESTful APIs. For example, a product service can expose a REST API that a shopping cart service can call to get product details.
    • Message Queues: They can use message queues like RabbitMQ or Kafka. For instance, in an e-commerce system, when an order is placed, the order service can send a message to a message queue, which the inventory service listens to and updates the inventory accordingly.

Spring API Gateway

  • Purpose: It acts as a single entry point for all microservices. It provides features like request routing, authentication, rate limiting, etc. For example, in a microservices-based application, all external requests first come to the API gateway, which then routes the requests to the appropriate microservices. It can also apply authentication and authorization rules before allowing the request to reach the microservices.

Cascading Failure in Microservice and Prevention

  • Explanation: In a microservices environment, if one microservice fails, it can cause other dependent microservices to fail, leading to a cascading effect. For example, if the user service in a social media app fails, the services that depend on it like the post service (which needs to get user information) and the comment service may also fail.
  • Prevention:
    • Circuit Breaker: Implementing circuit breakers can prevent cascading failures. If a microservice fails to respond after a certain number of attempts, the circuit breaker trips and stops sending requests to that service, preventing other services from waiting indefinitely and potentially failing.
    • Isolation: Using techniques like thread pools and resource isolation to ensure that the failure of one microservice doesn’t exhaust the resources of other services.

Circuit Breaker

  • Explanation: A circuit breaker is a design pattern used to prevent cascading failures in a microservices architecture. It monitors the health of a service and if the service fails to respond or returns errors frequently, the circuit breaker trips and stops sending requests to that service for a certain period.
  • How it Works:
    • Closed State: Initially, the circuit breaker is in the closed state and all requests are sent to the service as normal.
    • Open State: If the service fails a certain number of times within a given time period, the circuit breaker opens. In this state, all requests to the service are immediately failed without being sent to the actual service.
    • Half-Open State: After a certain period in the open state, the circuit breaker enters the half-open state. It allows a small number of requests to be sent to the service to check if it has recovered. If the requests succeed, the circuit breaker closes and normal operation resumes. If the requests fail, the circuit breaker returns to the open state.

Session23-DEVOPS

  1. Use your own words to explain Jenkins.
  2. Can you talk about CI/CD?
  3. Git command you used in the project
  4. How do you release from the git repository
  5. How do you combine several commits together
  6. What is git cherry-pick
  7. difference between git and svn
  8. difference git merge and rebase

    1. Jenkins

    Jenkins is an open - source automation server widely used in software development. Its main purpose is to automate various stages of the software development lifecycle, such as building, testing, and deploying applications.

How it works:
Jenkins has a web - based interface where you can create and configure jobs. A job in Jenkins can represent a specific task, like building a Java project or running a set of unit tests. You can define the steps of the job, including the commands to execute, the source code repositories to pull from, and the environment variables to use.

Example:
Suppose you are developing a Python web application. You can set up a Jenkins job to automatically pull the latest code from a Git repository, install the necessary Python dependencies, run unit tests, and then deploy the application to a staging server if the tests pass.

2. CI/CD

  • CI (Continuous Integration):
    • CI is a development practice where developers frequently integrate their code changes into a shared repository. Every time code is pushed to the repository, an automated build and test process is triggered. This helps to catch integration issues early in the development cycle.
    • Example: In a team of developers working on a mobile app, each developer may push their code changes to the main Git repository several times a day. A CI server (like Jenkins) then automatically builds the app from the latest code and runs unit and integration tests. If any tests fail, the developers are notified immediately.
  • CD (Continuous Delivery/Deployment):
    • Continuous Delivery is an extension of CI. It ensures that the software can be reliably released to production at any time. After the code passes the CI tests, it is automatically prepared for deployment, but the actual deployment to production may be a manual step.
    • Continuous Deployment takes it a step further and automatically deploys the software to production if it passes all the tests.
    • Example: For a web - based e - commerce application, with continuous delivery, once the code passes the CI tests, it is packaged and stored in a deployment artifact repository. A release manager can then decide when to deploy it to the production servers. In continuous deployment, the application is automatically deployed to production as soon as the tests pass.

3. Git commands used in a project

  • git clone: Used to create a local copy of a remote Git repository.
    • Example: git clone https://github.com/user/repo.git creates a local copy of the repo repository hosted on GitHub.
  • git add: Adds changes in the working directory to the staging area.
    • Example: git add src/main.py adds the changes made to the main.py file to the staging area.
  • git commit: Commits the changes from the staging area to the local repository with a descriptive message.
    • Example: git commit -m "Fixed a bug in the login function"
  • git push: Pushes the committed changes from the local repository to a remote repository.
    • Example: git push origin main pushes the changes from the local main branch to the main branch of the remote repository named origin.
  • git pull: Fetches and merges changes from a remote repository into the local repository.
    • Example: git pull origin main fetches the latest changes from the main branch of the origin remote repository and merges them into the local main branch.

4. Releasing from the Git repository

  • Create a Release Branch (Optional):
    • You can create a dedicated release branch from the main development branch (e.g., main or master). For example, git checkout -b release/v1.0 main creates a new release branch named release/v1.0 from the main branch.
  • Tag the Release:
    • Use the git tag command to mark a specific commit as a release. For example, git tag v1.0 tags the current commit as version v1.0. You can then push the tags to the remote repository using git push origin --tags.
  • Build and Deploy:
    • Use the tagged commit to build the application and deploy it to the appropriate environments (staging, production, etc.).

5. Combining several commits together

You can use the git rebase -i (interactive rebase) command to combine multiple commits.

  • Example: Suppose you have made 3 consecutive commits and want to combine them into one.
    • First, find the commit hash of the commit before the first commit you want to combine. Let’s say the commit hash is abc123.
    • Then run git rebase -i abc123. This will open an editor where you can see a list of commits.
    • Change the pick keyword to squash (or s) for the commits you want to combine with the previous one.
    • Save and close the editor. Git will then combine the commits, and you can provide a new commit message for the combined commit.

6. Git cherry - pick

git cherry - pick is used to apply a specific commit from one branch to another.

  • Example: Suppose you have a feature branch with a commit that you want to apply to the main branch.
    • First, switch to the main branch: git checkout main.
    • Then use git cherry - pick <commit - hash> where <commit - hash> is the hash of the commit on the feature branch that you want to apply. Git will then try to apply that commit to the main branch.

7. Difference between Git and SVN

  • Architecture:
    • Git: It is a distributed version control system. Every developer has a complete copy of the repository, including the entire commit history. This allows developers to work offline and perform most operations locally.
    • SVN: It is a centralized version control system. There is a single central repository, and developers need to connect to it to perform operations like committing changes or getting the latest code.
  • Branching and Merging:
    • Git: Branching and merging are very fast and easy. Creating a new branch is just a matter of creating a new pointer to a commit. Merging between branches is also efficient.
    • SVN: Branching and merging can be more complex and slower. It involves copying the entire directory structure in the repository to create a branch.
  • Data Integrity:
    • Git: It uses a hash - based system to ensure data integrity. Every commit, file, and directory has a unique hash, and any change to the data will result in a different hash.
    • SVN: While it also has some mechanisms for data integrity, it is not as robust as Git’s hash - based system.

8. Difference between Git merge and rebase

  • Merge:
    • A git merge combines the changes from two or more branches into one. It creates a new “merge commit” that has two parents, one from each branch being merged.
    • Example: If you have a feature branch and a main branch, and you want to integrate the changes from the feature branch into the main branch, you can run git checkout main followed by git merge feature. This will create a merge commit on the main branch.
    • The commit history after a merge shows a more complex, branching structure.
  • Rebase:
    • A git rebase moves or combines a sequence of commits to a new base commit. It takes the commits from one branch and replays them on top of another branch.
    • Example: If you have a feature branch and a main branch, and you want to update the feature branch with the latest changes from the main branch, you can run git checkout feature followed by git rebase main. This will take the commits from the feature branch and replay them on top of the latest commit on the main branch.
    • The commit history after a rebase is linear, which can make it easier to understand and follow. However, it can also be more complex to resolve conflicts during a rebase compared to a merge.

Session25-CLOUD

  1. AWS difference between parameter store and secret manager
  2. AWS where to store certificate file
  3. extra:(those we are not sure which session to put in)
  4. Use your own words to explain TDD and why use TDD.
  5. Please do some research on Redis and use your own words to explain what Redis is.
  6. Use your own words to explain what Swagger is.
  7. Please do some research on ELK and use your own words to explain what they are.
  8. Use your own words to explain Jira.
  9. What is RabbitMQ and what can it help us to achieve in a web application?What are the component of RabbitMQ?
  10. What are different types of Exchange that exist in RabbitMQ?
  11. What is Scheduler and what can it help us to achieve in a web application?

1. AWS: Difference between Parameter Store and Secret Manager

Parameter Store

  • Explanation: AWS Systems Manager Parameter Store is a service that allows you to store configuration data such as database connection strings, API keys, and other parameters in a hierarchical structure. It’s designed for storing both plain - text and encrypted data. It helps in centralizing configuration management, making it easier to manage and update application settings across multiple environments.
  • Example: Suppose you have a microservices - based application deployed in multiple AWS regions. You can store the database connection strings for each region in the Parameter Store. For instance, a key - value pair like /myapp/production/db - connection - string with the actual connection string as the value. When your application starts, it can retrieve the appropriate connection string from the Parameter Store based on the environment.

Secret Manager

  • Explanation: AWS Secrets Manager is focused on securely managing secrets such as passwords, access keys, and other sensitive information. It provides features like automatic rotation of secrets, auditing, and fine - grained access control. It’s designed to reduce the risk of exposing sensitive data and simplify the process of keeping secrets up - to - date.
  • Example: Consider an application that uses an Amazon RDS database. You can store the database password in the Secrets Manager. The application can then retrieve the password securely when it needs to connect to the database. Additionally, you can set up automatic rotation of the password every 30 days, which helps in maintaining security.

2. AWS: Where to store certificate files

  • AWS Certificate Manager (ACM):
    • ACM is the recommended service for managing SSL/TLS certificates in AWS. It provides free SSL/TLS certificates for use with AWS services such as Elastic Load Balancing, Amazon CloudFront, and API Gateway. You can easily request, renew, and manage certificates through the ACM console or API.
    • Example: If you have a web application running behind an Elastic Load Balancer, you can request an SSL/TLS certificate from ACM and associate it with the load balancer. This enables secure communication between clients and your application.
  • AWS S3:
    • You can also store certificate files in an Amazon S3 bucket. However, you need to ensure proper security measures such as encryption and access control. This option is useful if you need to use the certificates with non - AWS services or if you want to have more control over the storage and management of the certificates.
    • Example: If you have an on - premise server that needs to use an SSL/TLS certificate stored in AWS, you can store the certificate in an S3 bucket and download it to the server when needed.

3. TDD (Test - Driven Development)

  • Explanation: TDD is a software development process where you write tests before writing the actual production code. The process typically follows a cycle of “Red - Green - Refactor”. First, you write a test that initially fails (Red). Then, you write the minimum amount of code to make the test pass (Green). Finally, you refactor the code to improve its design, readability, and maintainability without changing its behavior.
  • Why use TDD:
    • Early Bug Detection: By writing tests first, you can catch bugs early in the development process, reducing the cost of fixing them later.
    • Improved Design: TDD encourages writing modular and testable code, which leads to better software design.
    • Documentation: Tests serve as living documentation for the code, making it easier for other developers to understand how the code works.
  • Example: Suppose you are developing a simple calculator class with an add method. First, you write a test like this in Java using JUnit:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    public class CalculatorTest {
    @Test
    public void testAdd() {
    Calculator calculator = new Calculator();
    int result = calculator.add(2, 3);
    assertEquals(5, result);
    }
    }

This test will initially fail because the Calculator class and the add method don’t exist yet. Then you write the minimum code to make the test pass:

public class Calculator {
public int add(int a, int b) {
return a + b;
}
}

Finally, you can refactor the code if needed, for example, by adding more error handling or improving the code style.

4. Redis

  • Explanation: Redis is an open - source, in - memory data structure store that can be used as a database, cache, and message broker. It supports various data structures such as strings, hashes, lists, sets, and sorted sets. Redis is known for its high performance because it stores data in memory, which allows for very fast read and write operations. It also provides features like data persistence, replication, and clustering.
  • Example: In a web application, Redis can be used as a cache to store frequently accessed data. For example, if you have a news website, you can store the top - viewed articles in Redis. When a user requests the top - viewed articles page, the application first checks Redis. If the data is available in Redis, it can be returned immediately, reducing the load on the database.

5. Swagger

  • Explanation: Swagger is a set of open - source tools and frameworks for designing, building, documenting, and consuming RESTful APIs. It provides a way to describe the structure of an API using a JSON or YAML - based specification. Swagger tools can then generate interactive documentation, client libraries, and server stubs based on the API specification.
  • Example: Suppose you have a RESTful API for a book management system. You can use Swagger to define the API endpoints, request and response formats, and available operations. Swagger UI can then generate an interactive documentation page where developers can explore the API, test the endpoints, and see the expected input and output formats.

6. ELK (Elasticsearch, Logstash, Kibana)

  • Explanation:
    • Elasticsearch: It is a distributed, open - source search and analytics engine. It stores data in a JSON - like format and allows for fast and flexible searching, filtering, and aggregation of data. It can handle large volumes of data and is often used for log analysis, full - text search, and real - time analytics.
    • Logstash: Logstash is a data processing pipeline that collects, filters, and transforms data from various sources (such as log files, system metrics, and application events) and sends it to a destination (such as Elasticsearch). It can perform tasks like parsing log data, enriching it with additional information, and cleaning it up.
    • Kibana: Kibana is a web - based visualization tool that works with Elasticsearch. It allows users to create visualizations, dashboards, and reports based on the data stored in Elasticsearch. It provides an intuitive interface for exploring and analyzing data.
  • Example: In a large - scale web application, Logstash can collect all the application logs from different servers. It can then parse the logs, extract relevant information such as request URLs, response times, and error messages. The processed data is sent to Elasticsearch for storage. Developers and administrators can then use Kibana to create dashboards showing the application’s performance metrics, error rates, and other important information.

7. Jira

  • Explanation: Jira is a popular project management and issue - tracking tool developed by Atlassian. It allows teams to plan, track, and manage projects, tasks, and bugs. Jira provides features such as customizable workflows, issue tracking, reporting, and integration with other tools. It can be used for software development projects, but also for other types of projects in different industries.
  • Example: In a software development team, Jira can be used to manage the development lifecycle of a project. Developers can create issues for new features, bugs, and tasks. The project manager can assign these issues to team members, set deadlines, and track the progress of each issue. Jira also provides reports on the project’s status, such as the number of open and closed issues, the time taken to resolve issues, and the overall project progress.

8. RabbitMQ

  • Explanation: RabbitMQ is an open - source message broker software that implements the Advanced Message Queuing Protocol (AMQP). It enables applications to communicate with each other by sending and receiving messages. It acts as an intermediary between producers (applications that send messages) and consumers (applications that receive messages).
  • What it can help achieve in a web application:
    • Decoupling: It allows different components of a web application to be decoupled. For example, in an e - commerce application, the order processing component can send messages to the inventory management component through RabbitMQ without having direct knowledge of the inventory system.
    • Asynchronous Processing: It enables asynchronous processing, which can improve the performance and scalability of the application. For instance, when a user submits a form, the application can send a message to RabbitMQ and continue processing other tasks without waiting for the form data to be fully processed.
  • Components of RabbitMQ:
    • Producer: An application that sends messages to a RabbitMQ broker.
    • Consumer: An application that receives messages from a RabbitMQ broker.
    • Queue: A buffer that stores messages until they are consumed.
    • Exchange: Routes messages to one or more queues based on rules.
    • Broker: The RabbitMQ server that manages the queues, exchanges, and message routing.

9. Different types of Exchanges in RabbitMQ

  • Direct Exchange: Routes messages to queues based on the message’s routing key. Each queue is bound to the direct exchange with a specific routing key. When a message is sent to the direct exchange with a certain routing key, it is delivered to the queues that are bound with the same routing key.
    • Example: In a logging application, different types of logs (e.g., error logs, warning logs) can be sent to different queues using a direct exchange.
  • Fanout Exchange: Routes messages to all the queues that are bound to it, regardless of the routing key. It is useful when you want to broadcast messages to multiple consumers.
    • Example: In a news application, when a new news article is published, a message can be sent to a fanout exchange, and all the queues (e.g., queues for different user groups) bound to the exchange will receive the message.
  • Topic Exchange: Routes messages to queues based on a pattern matching of the routing key. Queues are bound to the topic exchange with a binding key that can contain wildcards (* for single - word matching and # for multi - word matching).
    • Example: In a financial application, messages about different stocks can be sent to a topic exchange. A queue can be bound to the exchange with a binding key like stocks.# to receive all messages related to stocks.
  • Headers Exchange: Routes messages based on the message headers rather than the routing key. Queues are bound to the headers exchange with a set of header values. When a message is sent with specific headers, it is delivered to the queues that match the header values.

10. Scheduler

  • Explanation: A scheduler in a web application is a component that allows you to schedule tasks to run at specific times or intervals. It can be used to perform various tasks such as running batch jobs, sending periodic notifications, and refreshing caches.
  • What it can help achieve in a web application:
    • Automation: It automates repetitive tasks, reducing the need for manual intervention. For example, a scheduler can be used to automatically generate daily reports in a business application.
    • Resource Optimization: It can be used to schedule resource - intensive tasks during off - peak hours to optimize the use of system resources. For instance, a scheduler can be used to perform database backups at night when the application has low traffic.
  • Example: In a content management system, a scheduler can be used to publish new articles at a specific time. The administrator can set a publication time for an article, and the scheduler will ensure that the article is made available to the public at the specified time.