Contractor knowledge syllabus

. . .

Syllabus

Session 1: JVM, String, Final
  1. Warm Up
  2. JVM Memory Management
  3. JVM, JDK, JRE
  4. Garbage Collection
  5. String & StringBuilder & StringBuffer
  6. Final, Finally, Finalize
  7. Immutable classes (optional: basic Java syntax)

Session 2: Static, OOP
  1. Static
  2. Marker Interfaces - Serializable, Cloneable
  3. OOP
  4. SOLID Principles
  5. Reflection
  6. Generics

Session 3: Collections
  1. Array vs ArrayList vs LinkedList
  2. Set, TreeSet, LinkedHashSet
  3. Map, LinkedHashMap, ConcurrentHashMap (how it works)
  4. SynchronizedMap
  5. Iterator vs Enumeration

Session 4: Exceptions, Design Patterns
  1. Design Patterns - Singleton, Factory, Observer, Proxy
  2. Exception Types - compile-time, runtime, custom

Session 5: Threads
  1. Multithreaded interaction (Synchronized, Atomic, ThreadLocal, Volatile)
  2. ReentrantLock
  3. Executor and ThreadPool, ForkJoinPool
  4. Future & CompletableFuture
  5. Runnable vs Callable
  6. Semaphore vs Mutex

Session 6: Java 8, 17
  1. Java 8: Functional Interfaces, Lambdas, Stream API (map, filter, sorted, groupingBy, etc.), Optional, Default methods
  2. Java 17: Sealed Classes - advantages vs limitations, use across packages

Session 7: SQL
  1. Primary Keys, Normalization
  2. Different types of joins
  3. Most-asked SQL queries - nth highest salary; highest salary in each department; employees whose salary is greater than their manager's
  4. Introduction to stored procedures and functions
  5. Clustered index vs non-clustered index
  6. Explain Plan - what it does and what it can tell you

Session 8: NoSQL
  1. SQL vs NoSQL
  2. MongoDB vs Cassandra introduction
  3. ACID vs CAP explained

Session 9: REST API
  1. DispatcherServlet
  2. REST API
  3. How to create a good REST API
  4. HTTP status codes: 200, 201, 400, 401, 403, 404, 500, 502, 503, 504
  5. Introduction to GraphQL, WebSocket, gRPC
  6. Reactive Java

Session 10: Spring Core
  1. IoC/DI
  2. Bean scopes
  3. Constructor vs setter vs field injection

Session 11: Spring Annotations
  1. Common Spring annotations
  2. @Controller vs @RestController
  3. @Qualifier, @Primary
  4. Spring Cache and Retry

Session 12: Spring Boot
  1. How to create a Spring Boot project from scratch
  2. Benefits of Spring Boot
  3. The @SpringBootApplication annotation
  4. Auto-configuration, and how to disable it
  5. Actuator

Session 13: Spring Boot 2
  1. Spring active profiles
  2. AOP
  3. @ExceptionHandler, @ControllerAdvice

Session 14: Data Access
  1. JDBC, Statement vs PreparedStatement, DataSource
  2. Hibernate ORM, Session, caching
  3. Optimistic locking - adding a version column
  4. Associations: many-to-many

Session 15: Transactions, JPA
  1. @Transactional - atomic operations
  2. Propagation, isolation
  3. JPA naming conventions
  4. Paging and sorting using JPA
  5. Hibernate persistence context

Session 16: Security
  1. How to implement security by overriding Spring classes
  2. Basic authentication and password encryption
  3. JWT tokens and workflow
  4. OAuth2 workflow
  5. Authorization based on user roles

Session 17: Unit Testing
  1. Different types of tests across the project lifecycle
  2. Unit tests, mocking
  3. Testing REST APIs with REST Assured

Session 18: Automation Testing
  1. BDD - Cucumber - annotations
  2. Load testing with JMeter
  3. The JProfiler performance tool
  4. A/B testing

Session 19: Microservices
  1. Benefits/disadvantages of microservices
  2. How to split a monolith into microservices
  3. Circuit breakers - concept, retries, fallback methods
  4. Load balancers - concept and algorithms
  5. API Gateway
  6. Config Server

Session 20: Kafka
  1. Kafka - concepts, how it works, and how messages are assigned to partitions
  2. Consumer groups, assignment strategies
  3. Message ordering

Session 21: Kafka 2
  1. Duplicate messages in Kafka
  2. Message loss in Kafka
  3. Poison messages, DLQ
  4. Kafka security (SASL, ACLs, encryption, etc.)

Session 22: Distributed Systems
  1. Microservices: how services communicate with each other
  2. Saga pattern
  3. Monitoring: Splunk, Grafana, Kibana, CloudWatch, etc.
  4. System design: distributed systems

Session 23: DevOps
  1. CI/CD
  2. Jenkins pipelines with examples
  3. Git commands: squash, cherry-pick, etc.
  4. On-call: PagerDuty, etc.
  5. How to solve a production issue with or without logs

Session 24: Kubernetes
  1. Kubernetes, EKS, WCNP, kubectl

Session 25: Cloud
  1. AWS modules with examples

Optional

In Java, Optional is a container class introduced in Java 8. It can hold a non-null value (Optional.of(value)) or represent the absence of a value (Optional.empty()). Its main purpose is to avoid NullPointerException and make code more readable and robust. The common uses of Optional are detailed below.

1. Creating an Optional

  • Optional.of(T value): creates an Optional containing a non-null value. If the argument is null, it throws a NullPointerException.
  • Optional.ofNullable(T value): creates an Optional that may hold a null value. If the argument is null, it returns an empty Optional.
  • Optional.empty(): creates an empty Optional.
import java.util.Optional;

public class OptionalCreationExample {
    public static void main(String[] args) {
        // Optional.of: create an Optional containing a non-null value
        String nonNullValue = "Hello";
        Optional<String> optionalWithValue = Optional.of(nonNullValue);

        // Optional.ofNullable: create an Optional that may hold a null value
        String nullableValue = null;
        Optional<String> optionalWithNullableValue = Optional.ofNullable(nullableValue);

        // Optional.empty: create an empty Optional
        Optional<String> emptyOptional = Optional.empty();
    }
}

2. Checking whether an Optional contains a value

  • isPresent(): returns true if the Optional contains a non-null value, false otherwise.
  • isEmpty(): introduced in Java 11; returns true if the Optional is empty, false otherwise.
import java.util.Optional;

public class OptionalIsPresentExample {
    public static void main(String[] args) {
        Optional<String> optionalWithValue = Optional.of("Hello");
        Optional<String> emptyOptional = Optional.empty();

        System.out.println("optionalWithValue has a value: " + optionalWithValue.isPresent());
        System.out.println("emptyOptional has a value: " + emptyOptional.isPresent());

        System.out.println("optionalWithValue is empty: " + optionalWithValue.isEmpty());
        System.out.println("emptyOptional is empty: " + emptyOptional.isEmpty());
    }
}

3. Getting the value out of an Optional

  • get(): returns the value if one is present; otherwise throws a NoSuchElementException.
  • orElse(T other): returns the value if one is present; otherwise returns the given default other.
  • orElseGet(Supplier<? extends T> other): returns the value if one is present; otherwise calls the Supplier's get() method to produce a default value.
  • orElseThrow(): returns the value if one is present; otherwise throws a NoSuchElementException. Since Java 10 you can also use orElseThrow(Supplier<? extends X> exceptionSupplier) to throw a custom exception.
import java.util.Optional;

public class OptionalGetValueExample {
    public static void main(String[] args) {
        Optional<String> optionalWithValue = Optional.of("Hello");
        Optional<String> emptyOptional = Optional.empty();

        // get
        System.out.println("Value of optionalWithValue: " + optionalWithValue.get());
        // The following line would throw NoSuchElementException
        // System.out.println("Value of emptyOptional: " + emptyOptional.get());

        // orElse
        String valueWithDefault = emptyOptional.orElse("Default Value");
        System.out.println("Default for emptyOptional: " + valueWithDefault);

        // orElseGet
        String valueWithSupplier = emptyOptional.orElseGet(() -> "Value from Supplier");
        System.out.println("Supplier default for emptyOptional: " + valueWithSupplier);

        // orElseThrow
        try {
            emptyOptional.orElseThrow();
        } catch (Exception e) {
            System.out.println("Caught exception: " + e.getClass().getName());
        }

        // orElseThrow with a custom exception
        try {
            emptyOptional.orElseThrow(() -> new RuntimeException("Value is absent"));
        } catch (Exception e) {
            System.out.println("Caught custom exception: " + e.getMessage());
        }
    }
}

4. Operating on the value inside an Optional

  • ifPresent(Consumer<? super T> action): runs the given Consumer if a value is present; otherwise does nothing.
  • ifPresentOrElse(Consumer<? super T> action, Runnable emptyAction): introduced in Java 9; runs action if a value is present, otherwise runs emptyAction.
  • map(Function<? super T,? extends U> mapper): if a value is present, applies the Function to it and returns an Optional containing the result; otherwise returns an empty Optional.
  • flatMap(Function<? super T, Optional<U>> mapper): like map, but the Function must itself return an Optional, which is not wrapped again.
  • filter(Predicate<? super T> predicate): if a value is present and satisfies the Predicate, returns an Optional containing it; otherwise returns an empty Optional.
import java.util.Optional;

public class OptionalOperateValueExample {
    public static void main(String[] args) {
        Optional<String> optionalWithValue = Optional.of("Hello");
        Optional<String> emptyOptional = Optional.empty();

        // ifPresent
        optionalWithValue.ifPresent(value -> System.out.println("Value present, running action: " + value));
        emptyOptional.ifPresent(value -> System.out.println("Value present, running action: " + value));

        // ifPresentOrElse
        optionalWithValue.ifPresentOrElse(
                value -> System.out.println("Value present, running action: " + value),
                () -> System.out.println("Value absent, running empty action")
        );
        emptyOptional.ifPresentOrElse(
                value -> System.out.println("Value present, running action: " + value),
                () -> System.out.println("Value absent, running empty action")
        );

        // map
        Optional<Integer> lengthOptional = optionalWithValue.map(String::length);
        System.out.println("Length of the value: " + lengthOptional.orElse(0));

        // flatMap: unlike map, it avoids the nested Optional<Optional<Integer>>
        Optional<Optional<Integer>> nestedOptional = optionalWithValue.map(s -> Optional.of(s.length()));
        Optional<Integer> flatMappedOptional = optionalWithValue.flatMap(s -> Optional.of(s.length()));
        System.out.println("Result of flatMap: " + flatMappedOptional.orElse(0));

        // filter
        Optional<String> filteredOptional = optionalWithValue.filter(s -> s.length() > 3);
        System.out.println("Filtered result: " + filteredOptional.orElse("No value matching the condition"));
    }
}

The examples above cover the common uses of Optional. Used judiciously, Optional helps avoid NullPointerException and makes code more robust and readable.

Kubernetes

pod vs node in Kubernetes

Source: www.cloudzero.com

Kubernetes pods, nodes, and clusters get mixed up. Here’s a simple guide for beginners or if you just need to reaffirm your knowledge of Kubernetes components.

Kubernetes is increasingly becoming the standard way to deploy, run, and maintain cloud-native applications that run inside containers. Kubernetes (K8s) automates most container management tasks, empowering engineers to manage high-performing, modern applications at scale.

Meanwhile, several surveys, including those from VMware and Gartner, suggest that inadequate expertise with Kubernetes has held back organizations from fully adopting containerization. So, maybe you’re wondering how Kubernetes components work.

In that case, we’ve put together a bookmarkable guide on pods, nodes, clusters, and more. Let’s dive right in, starting with the very reason Kubernetes exists: containers.

Quick Summary

|               | Pod | Node | Cluster |
| ------------- | --- | ---- | ------- |
| Description   | The smallest deployable unit in a Kubernetes cluster | A physical or virtual machine | A grouping of multiple nodes in a Kubernetes environment |
| Role          | Isolates containers from underlying servers to boost portability | Provides the resources and instructions for how to run containers optimally | Provides the compute resources (CPU, volumes, etc.) to run containerized apps; has the control plane to orchestrate containerized apps through nodes and pods |
| What it hosts | Application containers, supporting volumes, and a shared IP address for logically related containers | Pods with application containers inside them, kubelet | Nodes containing the pods that host the application containers, control plane, kube-proxy, etc. |

What Is A Container?

In software engineering, a container is an executable unit of software that packages and runs an entire application, or portions of it, within itself.

Containers comprise not only the application’s binary files, but also libraries, runtimes, configuration files, and any other dependencies that the application requires to run optimally. Talk about self-sufficiency.

(Figure: containers vs. virtual machine architectures)

This design enables a container to be an entire application runtime environment unto itself.

As a result, a container isolates the application it hosts from the external environment it runs on. This enables applications running in containers to be built in one environment and deployed in different environments without compatibility problems.

Also, because containers share resources and do not host their own operating system, they are leaner than virtual machines (VMs). This makes deploying containerized applications much quicker and more efficient than on conventional virtual machines.

What Is A Containerized Application?

In cloud computing, a containerized application refers to an app that has been specially built using cloud-native architecture for running within containers. A container can either host an entire application or small, distributed portions of it (which are known as microservices).

Developing, packaging, and deploying applications in containers is referred to as containerization. Apps that are containerized can run in a variety of environments and devices without causing compatibility problems.

One more thing. Developers can isolate faulty containers and fix them independently before they affect the rest of the application or cause downtime. This is something that is extremely tricky to do with traditional monolithic applications.

What Is A Kubernetes Pod?

A Kubernetes pod is a collection of one or more application containers.

The pod is an additional level of abstraction that provides shared storage (volumes), a shared IP address, communication between containers, and other information about how to run the application containers. Check this out:

(Figure: Kubernetes pod architecture, credit Kubernetes.io)

So, containers do not run directly on nodes; pods are the layer through which Kubernetes starts and stops containers.

Containers that must communicate directly to function are housed in the same pod. These containers are also co-scheduled because they work within a similar context. Also, the shared storage volumes enable pods to last through container restarts because they provide persistent data.

Kubernetes also scales the number of pods up and down (by replicating them) to meet changing load, traffic, and performance requirements. Similar pods scale together.

Another unique feature of Kubernetes is that rather than creating containers directly, it generates pods that already have containers.

Also, whenever you create a K8s pod, the platform automatically schedules it to run on a Node. This pod will remain active until the specific process completes, resources to support the pod run out, the pod object is removed, or the host node terminates or fails.

Each pod runs inside a Kubernetes node, and each pod can fail over to another, logically similar pod running on a different node in case of failure. And speaking of Kubernetes nodes.

What Is A Kubernetes Node?

A Kubernetes node is either a virtual or physical machine that one or more Kubernetes pods run on. It is a worker machine that contains the necessary services to run pods, including the CPU and memory resources they need to run.

Now, picture this:

(Figure: how Kubernetes nodes work, credit Kubernetes.io)

Each node also comprises three crucial components:

  • Kubelet – This is an agent that runs inside each node to ensure pods are running properly, including communications between the Master and nodes.
  • Container runtime – This is the software that runs containers. It manages individual containers, including retrieving container images from repositories or registries, unpacking them, and running the application.
  • Kube-proxy – This is a network proxy that runs inside each node, managing the networking rules within the node (between its pods) and across the entire Kubernetes cluster.

Here’s what a Cluster is in Kubernetes.

What Is A Kubernetes Cluster?


Nodes usually work together in groups. A Kubernetes cluster contains a set of worker machines (nodes). The cluster automatically distributes workloads among its nodes, enabling seamless scaling.

Here’s that symbiotic relationship again.

A cluster consists of several nodes. The node provides the compute power to run the setup. It can be a virtual machine or a physical machine. A single node can run one or more pods.

Each pod contains one or more containers. A container hosts the application code and all the dependencies the app requires to run properly.

Something else. The cluster also comprises the Kubernetes Control Plane (or Master), which manages each node within it. The control plane is a container orchestration layer where K8s exposes the API and interfaces for defining, deploying, and managing containers’ lifecycles.

The master assesses each node and distributes workloads according to available nodes. This load balancing is automatic, ensures efficiency in performance, and is one of the most popular features of Kubernetes as a container management platform.

You can also run the Kubernetes cluster on different providers’ platforms, such as Amazon’s Elastic Kubernetes Service (EKS), Microsoft’s Azure Kubernetes Service (AKS), or the Google Kubernetes Engine (GKE).

Take The Next Step: View, Track, And Control Your Kubernetes Costs With Confidence

Open-source, highly scalable, and self-healing, Kubernetes is a powerful platform for managing containerized applications. But as Kubernetes components scale to support business growth, Kubernetes cost management tends to get overlooked.

Most cost tools only display your total cloud costs, not how Kubernetes containers contributed. With CloudZero, you can view Kubernetes costs down to the hour, as well as by K8s concepts such as cost per pod, container, microservice, namespace, and cluster.


By drilling down to this level of granularity, you are able to find out what people, products, and processes are driving your Kubernetes spending.

You can also combine your containerized and non-containerized costs to simplify your analysis. CloudZero enables you to understand your Kubernetes costs alongside your AWS, Azure, Google Cloud, Snowflake, Databricks, MongoDB, and New Relic spend. Getting the full picture.

You can then decide what to do next to optimize the cost of your containerized applications without compromising performance. CloudZero will even alert you to cost anomalies before you overspend.


Kubernetes FAQ

Is a Kubernetes Pod a Container?

Not exactly. A Kubernetes pod is a group of one or more containers that share storage and networking resources. Pods are the smallest deployable units in Kubernetes and manage their containers collectively, allowing them to run in a shared context with shared namespaces.

What is the difference between container node and pod?

A node is a worker machine in Kubernetes, part of a cluster, that runs containers and other Kubernetes components. A pod, on the other hand, is a higher-level abstraction that encapsulates one or more containers and their shared resources, managed collectively within a node.

Can a pod have multiple containers?

Yes, a Kubernetes pod can have multiple containers. Pods are designed to encapsulate closely coupled containers that need to share resources and communicate with each other over localhost. This approach facilitates running multiple containers within the same pod while treating them as a cohesive unit for scheduling, scaling, and management within the Kubernetes cluster.

How many pods run on a node?

The number of Kubernetes pods that can run on a node depends on various factors such as the node’s resources (CPU, memory, etc.), the resource requests and limits set by the pods, and any other applications or system processes running on the node.

Generally, a node can run multiple pods, and the Kubernetes scheduler determines pod placement based on available resources and scheduling policies defined in the cluster configuration.

Stream API

Java's Stream API provides several kinds of operations, which fall into intermediate operations and terminal operations. Examples of each kind are given below.

Intermediate operations

Intermediate operations return a new stream, allowing calls to be chained. Common intermediate operations include filter, map, flatMap, distinct, and sorted.

Terminal operations

Terminal operations trigger processing of the stream and produce a result. Common terminal operations include forEach, collect, reduce, count, findFirst, and anyMatch.

Example code for the different kinds of operations follows:

Code walkthrough

  • Intermediate operations

    • filter: keeps the elements that satisfy a given condition.
    • map: maps each element of the stream to another element.
    • flatMap: flattens nested streams.
    • distinct: removes duplicate elements.
    • sorted: sorts the elements of the stream.
  • Terminal operations

    • forEach: iterates over each element of the stream.
    • collect: gathers the stream's elements into a collection.
    • reduce: reduces the elements, for example summing them.
    • count: counts the elements in the stream.
    • findFirst: finds the first element of the stream.
    • anyMatch: checks whether any element satisfies a given condition.

These examples show how the different kinds of Stream operations are used and what they produce.

import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class StreamAPIExamples {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        // Intermediate operations

        // 1. filter: keep only the even numbers
        System.out.println("filter example:");
        List<Integer> evenNumbers = numbers.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
        System.out.println("Even numbers: " + evenNumbers);

        // 2. map: map each number to its square
        System.out.println("\nmap example:");
        List<Integer> squaredNumbers = numbers.stream()
                .map(n -> n * n)
                .collect(Collectors.toList());
        System.out.println("Squares: " + squaredNumbers);

        // 3. flatMap: flatten nested lists
        System.out.println("\nflatMap example:");
        List<List<Integer>> nestedList = Arrays.asList(
                Arrays.asList(1, 2),
                Arrays.asList(3, 4),
                Arrays.asList(5, 6)
        );
        List<Integer> flattenedList = nestedList.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
        System.out.println("Flattened list: " + flattenedList);

        // 4. distinct: remove duplicate elements
        System.out.println("\ndistinct example:");
        List<Integer> distinctNumbers = numbers.stream()
                .distinct()
                .collect(Collectors.toList());
        System.out.println("Distinct list: " + distinctNumbers);

        // 5. sorted: sort the elements
        System.out.println("\nsorted example:");
        List<Integer> sortedNumbers = numbers.stream()
                .sorted()
                .collect(Collectors.toList());
        System.out.println("Sorted list: " + sortedNumbers);

        // Terminal operations

        // 6. forEach: iterate over the elements
        System.out.println("\nforEach example:");
        numbers.stream()
                .forEach(n -> System.out.print(n + " "));
        System.out.println();

        // 7. collect: gather the elements into a list
        System.out.println("\ncollect example:");
        List<Integer> collectedNumbers = numbers.stream()
                .collect(Collectors.toList());
        System.out.println("Collected list: " + collectedNumbers);

        // 8. reduce: sum the elements
        System.out.println("\nreduce example:");
        Optional<Integer> sum = numbers.stream()
                .reduce(Integer::sum);
        sum.ifPresent(result -> System.out.println("Sum of elements: " + result));

        // 9. count: count the elements
        System.out.println("\ncount example:");
        long count = numbers.stream()
                .count();
        System.out.println("Element count: " + count);

        // 10. findFirst: find the first element
        System.out.println("\nfindFirst example:");
        Optional<Integer> firstNumber = numbers.stream()
                .findFirst();
        firstNumber.ifPresent(result -> System.out.println("First element: " + result));

        // 11. anyMatch: check whether any element matches
        System.out.println("\nanyMatch example:");
        boolean hasEven = numbers.stream()
                .anyMatch(n -> n % 2 == 0);
        System.out.println("Any even number: " + hasEven);
    }
}

Is List the only type with a stream?

No, List is not the only type with a stream method. In Java, many collection classes and other types can create streams. Details below:

1. Collections implementing the Collection interface

The Collection interface declares the stream() method, so every class that implements Collection can call it to produce a stream. Common examples:

  • List: e.g. ArrayList and LinkedList.

    import java.util.ArrayList;
    import java.util.List;

    public class ListStreamExample {
        public static void main(String[] args) {
            List<String> list = new ArrayList<>();
            list.add("apple");
            list.add("banana");
            list.add("cherry");
            long count = list.stream().filter(s -> s.startsWith("a")).count();
            System.out.println("Elements starting with 'a': " + count);
        }
    }
  • Set: e.g. HashSet and TreeSet.

    import java.util.HashSet;
    import java.util.Set;

    public class SetStreamExample {
        public static void main(String[] args) {
            Set<Integer> set = new HashSet<>();
            set.add(1);
            set.add(2);
            set.add(3);
            int sum = set.stream().mapToInt(Integer::intValue).sum();
            System.out.println("Sum of set elements: " + sum);
        }
    }
  • Queue: e.g. LinkedList (which implements both List and Queue) and PriorityQueue.

    import java.util.LinkedList;
    import java.util.Queue;

    public class QueueStreamExample {
        public static void main(String[] args) {
            Queue<String> queue = new LinkedList<>();
            queue.add("one");
            queue.add("two");
            queue.add("three");
            queue.stream().forEach(System.out::println);
        }
    }

2. Arrays

Arrays do not implement the Collection interface, but you can create a stream from an array with the Arrays utility class's stream method.

import java.util.Arrays;

public class ArrayStreamExample {
    public static void main(String[] args) {
        int[] array = {1, 2, 3, 4, 5};
        int sum = Arrays.stream(array).sum();
        System.out.println("Sum of array elements: " + sum);
    }
}

3. Others

  • Map: Map itself has no stream() method, but you can obtain a collection via its keySet(), values(), or entrySet() methods and create a stream from that.

    import java.util.HashMap;
    import java.util.Map;

    public class MapStreamExample {
        public static void main(String[] args) {
            Map<String, Integer> map = new HashMap<>();
            map.put("apple", 1);
            map.put("banana", 2);
            map.put("cherry", 3);
            long count = map.entrySet().stream().filter(entry -> entry.getValue() > 1).count();
            System.out.println("Entries with value greater than 1: " + count);
        }
    }
  • Static methods of the Stream class: you can create a stream directly with the Stream class's static methods, such as Stream.of(), Stream.iterate(), and Stream.generate().

    import java.util.stream.Stream;

    public class StaticStreamExample {
        public static void main(String[] args) {
            Stream<Integer> stream = Stream.of(1, 2, 3, 4, 5);
            int sum = stream.mapToInt(Integer::intValue).sum();
            System.out.println("Sum of stream elements: " + sum);
        }
    }

In summary, Java offers many ways to create a stream; List is just one of the types that can.

Kafka

Does a consumer group share a single offset per partition?

Yes. In Kafka, a consumer group shares a single offset for each partition. The mechanics, purpose, and implications are explained below.

How it works

  • Partition assignment: to guarantee in-order consumption within a partition and to avoid duplicate consumption, Kafka assigns each partition to at most one consumer within a consumer group. Since only one consumer in the group processes a given partition, that partition's consumed offset is unique within the group, i.e. shared.
  • Offset commits: after consuming messages, a consumer commits the offset to Kafka to record how far the group has read in that partition. Within a group, whichever consumer actually consumed the partition's messages, what gets committed is that partition's single group-level offset. For example, when consumer A commits an offset after consuming some messages from partition P0, that offset represents the whole group's progress in P0.

Purpose

  • Preserving message order: this ensures the group processes a partition's messages in order. If each consumer kept its own offset, several consumers could consume a partition's messages out of order, breaking per-partition ordering.
  • Avoiding duplicate consumption: a single offset tells the group exactly which messages have already been consumed. After a consumer restarts from a failure or a partition rebalance occurs, consumption resumes from the correct offset, avoiding reprocessing of messages that were already handled.

Implications

  • Failure recovery: if the consumer responsible for a partition fails, Kafka reassigns the partition to another consumer, which resumes from the shared offset, keeping consumption continuous.
  • Partition rebalancing: when the group rebalances, partitions are redistributed across consumers, and each partition's shared offset lets the new owner find the right place to start.
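
To make the shared-offset model concrete, here is a minimal sketch using the standard kafka-clients consumer API (the broker address, group id, and topic name are made-up values for illustration):

import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class SharedOffsetExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "order-processors");        // the group that owns the offset
        props.put("enable.auto.commit", "false");         // commit manually to make it explicit
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders")); // hypothetical topic
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("partition=%d offset=%d value=%s%n",
                        record.partition(), record.offset(), record.value());
                // The committed offset is stored per (group, topic, partition):
                // whichever member of the group commits, this is the group's
                // single position in that partition.
                TopicPartition tp = new TopicPartition(record.topic(), record.partition());
                consumer.commitSync(Map.of(tp, new OffsetAndMetadata(record.offset() + 1)));
            }
        }
    }
}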

Spring Annotations

Spring has numerous annotations. For example, @Component marks a plain POJO class as a component so that Spring can scan and manage it automatically. @Service is typically used for service-layer classes and @Repository for data-access-layer classes; both are specialized forms of @Component with more specific semantics. @Autowired wires beans automatically, and @RequestMapping maps web requests to handler methods. A short sketch follows.
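
As an illustration, here is a minimal sketch of how these annotations typically fit together (the class names are invented for the example):

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@Repository
class UserRepository {                 // data-access-layer component
    String findName(long id) {
        return "user-" + id;           // stand-in for a real database lookup
    }
}

@Service
class UserService {                    // service-layer component
    private final UserRepository repository;

    @Autowired                         // Spring injects the UserRepository bean
    UserService(UserRepository repository) {
        this.repository = repository;
    }

    String userName(long id) {
        return repository.findName(id);
    }
}

@RestController
class UserController {
    private final UserService service;

    UserController(UserService service) {
        this.service = service;
    }

    @GetMapping("/users/{id}/name")    // request mapping handled by this method
    String name(@PathVariable long id) {
        return service.userName(id);
    }
}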

@Controller vs @RestController

@Controller is an annotation in Spring MVC used to mark controller classes. It usually works with view technologies such as JSP and is used to return views. @RestController, on the other hand, is a combination of @Controller and @ResponseBody. It means that the methods in this controller class will by default return the return value directly to the client as the response body. It is suitable for building RESTful APIs, and the returned data is usually in formats such as JSON or XML.

In Spring MVC, the difference between @Controller and @RestController mainly lies in how they handle responses:

  • @Controller is used to return views (HTML, JSP, etc.), making it suitable for traditional web applications.
  • @RestController is used to return JSON or XML data, making it ideal for RESTful APIs.

1. @Controller Example (Returning a View)

By default, @Controller returns a view name. To return JSON, you must use @ResponseBody.

@Controller
public class MyController {

    @GetMapping("/hello")
    public String hello(Model model) {
        model.addAttribute("message", "Hello, Spring!");
        return "helloPage"; // Returns a view name, not JSON
    }
}

📌 Explanation

  • The return "helloPage"; statement does not return JSON. Instead, it looks for a helloPage.html or helloPage.jsp view.
  • To return JSON from @Controller, you must explicitly add @ResponseBody:

    @Controller
    public class MyController {

        @GetMapping("/json")
        @ResponseBody // Ensures the response is JSON instead of a view
        public String jsonResponse() {
            return "{\"message\": \"Hello, JSON!\"}";
        }
    }

2. @RestController Example (Returning JSON)

@RestController returns JSON by default, without requiring @ResponseBody.

@RestController
public class MyRestController {

    @GetMapping("/api/hello")
    public Map<String, String> helloJson() {
        Map<String, String> response = new HashMap<>();
        response.put("message", "Hello, JSON!");
        return response; // Automatically converted to JSON
    }
}

📌 Explanation

  • @RestController is a combination of @Controller and @ResponseBody, so there’s no need to add @ResponseBody manually.
  • Spring Boot automatically converts the Map response into JSON:

    {
        "message": "Hello, JSON!"
    }

3. When to Use @Controller vs @RestController?

✅ Use @Controller 👉 If your application needs to return HTML pages (traditional MVC web apps).
✅ Use @RestController 👉 If you’re building a REST API that returns JSON data.

In short:

  • If your app is frontend-backend separated (React, Vue, Angular consuming JSON), use @RestController.
  • If your app renders views on the server and serves HTML pages, use @Controller.

🚀 If you’re unsure, prefer @RestController—it aligns with modern web development practices!

  • @Qualifier, @Primary
    @Qualifier specifies, by name or qualifier, which bean should be wired during autowiring. When there are multiple beans of the same type, @Qualifier picks out the one to use. @Primary marks a bean as the preferred candidate for autowiring: when multiple beans of the same type exist, Spring selects the @Primary bean by default. (See the sketch after this list.)
  • Spring Cache and Retry
    Spring Cache provides declarative caching support. Annotations like @Cacheable, @CachePut, and @CacheEvict make it easy to cache method results, improving performance and reducing database access. Spring Retry offers a mechanism to automatically retry operations that fail due to transient conditions, such as network failures or temporary database unavailability. Retry policies, including the number of attempts, back-off intervals, and the exceptions that trigger a retry, are configured through annotations and configuration classes.
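
A minimal sketch of @Primary and @Qualifier in action (the gateway classes are invented for the example):

import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.stereotype.Service;

interface PaymentGateway {
    String name();
}

@Configuration
class GatewayConfig {
    @Bean
    @Primary                                  // default candidate when several beans match
    PaymentGateway stripeGateway() {
        return () -> "stripe";
    }

    @Bean
    PaymentGateway paypalGateway() {
        return () -> "paypal";
    }
}

@Service
class CheckoutService {
    private final PaymentGateway defaultGateway;   // receives the @Primary bean
    private final PaymentGateway paypal;           // explicitly selected by bean name

    CheckoutService(PaymentGateway defaultGateway,
                    @Qualifier("paypalGateway") PaymentGateway paypal) {
        this.defaultGateway = defaultGateway;
        this.paypal = paypal;
    }
}

And a sketch of declarative caching and retry, assuming @EnableCaching and @EnableRetry are configured and spring-retry is on the classpath:

import org.springframework.cache.annotation.Cacheable;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

@Service
class PriceService {
    @Cacheable("prices")            // result is cached, keyed by productId
    public double lookupPrice(String productId) {
        // ... expensive lookup that now runs only on a cache miss ...
        return 9.99;
    }
}

@Service
class QuoteClient {
    @Retryable(maxAttempts = 3)     // re-invoked up to 3 times if it throws
    public double fetchQuote(String symbol) {
        // ... remote call that may fail transiently ...
        return 101.5;
    }
}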

Spring

  • How to create a Spring Boot project from scratch
    To create a Spring Boot project from scratch, start by choosing a build tool such as Maven or Gradle. Then create a basic project structure with directories for source code, resources, etc. Add the necessary Spring Boot dependencies to the build configuration file. Define the main application class, usually annotated with @SpringBootApplication. You can then add controllers, services, and other components as needed to build your application.
  • Benefits of Spring Boot
    Spring Boot simplifies the development of Spring applications. It provides auto-configuration, which reduces the need for extensive XML or Java configuration. It also comes with a built-in embedded server, making it easy to run and deploy applications. Spring Boot starters allow common dependencies and functionality to be added quickly. Additionally, it offers features like Actuator endpoints for monitoring and managing the application, and it simplifies handling configuration properties and profiles.
  • Annotation @SpringBootApplication
    @SpringBootApplication is a composite annotation that combines @SpringBootConfiguration, @EnableAutoConfiguration, and @ComponentScan. It marks the main application class and tells Spring Boot to start the application, perform auto-configuration, and scan for components in the annotated class's package and its sub-packages.
  • AutoConfiguration, how to disable
    Spring Boot's auto-configuration automatically configures beans based on the dependencies present on the classpath. To disable specific auto-configurations, use @SpringBootApplication(exclude = {SomeAutoConfiguration.class}) on the application class. You can also set the spring.autoconfigure.exclude property in application.properties or application.yml, listing the fully qualified class names of the auto-configurations to disable. (See the sketch after this list.)
  • Actuator
    Spring Boot Actuator provides endpoints that let you monitor and manage your application, such as /health to check the application's health, /metrics to view performance metrics, /info to get information about the application, and many others. These endpoints can be used for debugging, monitoring, and optimizing the application.
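
For instance, a minimal sketch of disabling one specific auto-configuration (using DataSourceAutoConfiguration as the example):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration;

// Skips DataSource auto-configuration, e.g. while no database is configured yet.
@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
public class DemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

The property-based equivalent would be spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration in application.properties.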

  • Spring ActiveProfile
    Spring active profiles let you define different profiles for your application, such as dev, test, and prod, and configure different beans, properties, and behaviors for each. By setting the spring.profiles.active property you can switch between profiles at runtime, enabling different configurations for different environments, as in the sketch below.
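
A minimal sketch (the Db class and URLs are invented for the example):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

class Db {
    final String url;

    Db(String url) {
        this.url = url;
    }
}

@Configuration
class DbConfig {
    @Bean
    @Profile("dev")    // created only when spring.profiles.active=dev
    Db devDb() {
        return new Db("jdbc:h2:mem:devdb");
    }

    @Bean
    @Profile("prod")   // created only when spring.profiles.active=prod
    Db prodDb() {
        return new Db("jdbc:postgresql://prod-host/app");
    }
}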

  • @ExceptionHandler, @ControllerAdvice
    @ExceptionHandler is used in a controller class to handle specific exceptions that occur within the methods of that controller. @ControllerAdvice is a global exception handling mechanism. It allows you to define a class that can handle exceptions across multiple controllers. You can use @ExceptionHandler within a @ControllerAdvice class to handle different types of exceptions globally and return appropriate error responses to the client.

In Java and Spring framework, Dependency Injection (DI) is a design pattern that allows objects to receive dependencies from external sources rather than creating them internally. There are several ways to implement DI, and here are the common ones:

Different ways of DI

1. Constructor Injection

  • Principle: Dependencies are passed to the object through its constructor. This is the most recommended way as it enforces the immutability of dependencies and makes the object fully initialized when it is created.
  • Example:
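A minimal sketch using the Car and Engine classes described below:

    class Engine {
        void start() {
            System.out.println("Engine started");
        }
    }

    class Car {
        private final Engine engine;

        // The dependency is supplied at construction time and never changes.
        Car(Engine engine) {
            this.engine = engine;
        }

        void drive() {
            engine.start();
        }
    }

    // The caller (or a DI container) provides the dependency:
    // Car car = new Car(new Engine());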

In this example, the Car class depends on the Engine class, and the Engine instance is passed through the Car constructor.

2. Setter Injection

  • Principle: Dependencies are set through setter methods of the object. This allows for optional dependencies and the ability to change dependencies after the object is created.
  • Example:
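A minimal sketch using the Document and Printer classes described below:

    class Printer {
        void print(String text) {
            System.out.println(text);
        }
    }

    class Document {
        private Printer printer;   // optional dependency, may be set or replaced later

        void setPrinter(Printer printer) {
            this.printer = printer;
        }

        void printContent() {
            if (printer != null) {
                printer.print("document body");
            }
        }
    }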

Here, the Document class has a setPrinter method to set the Printer dependency.

3. Interface Injection

  • Principle: A class implements an interface that defines a method to inject the dependency. This approach is less common in practice compared to constructor and setter injection.
  • Example:
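A minimal sketch using the Client, ServiceInjector, and Service types described below:

    interface Service {
        void execute();
    }

    // The injection contract: implementors can receive a Service.
    interface ServiceInjector {
        void injectService(Service service);
    }

    class Client implements ServiceInjector {
        private Service service;

        @Override
        public void injectService(Service service) {
            this.service = service;
        }

        void doWork() {
            service.execute();
        }
    }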

In this example, the Client class implements the ServiceInjector interface to receive the Service dependency.

4. Field Injection

  • Principle: Dependencies are directly injected into the fields of a class using annotations (commonly used in Spring framework). It is simple but has some drawbacks like making the class harder to test in isolation.
  • Example:
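A minimal sketch with a custom @Autowired-like annotation, matching the description below:

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    // Stand-in for a container's injection annotation (e.g. Spring's @Autowired).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Autowired {}

    class Database {
        void query(String sql) {
            // ... runs the query ...
        }
    }

    class Application {
        @Autowired
        private Database database;   // the container injects this field reflectively
    }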

In this example, the Database dependency is injected directly into the Application class field using a custom @Autowired-like annotation.

Comparison

| Injection Method      | Advantages | Disadvantages |
| --------------------- | ---------- | ------------- |
| Constructor Injection | Enforces immutability, makes dependencies clear, object is fully initialized on creation | Not suitable for optional dependencies |
| Setter Injection      | Allows for optional dependencies, can change dependencies after creation | Object may be in an inconsistent state if dependencies are not set |
| Interface Injection   | Provides a clear contract for dependency injection | More complex and less commonly used |
| Field Injection       | Simple and concise | Harder to test in isolation, object may be in an inconsistent state |

How to solve circular dependencies

A circular dependency occurs when two or more components depend on each other, directly or indirectly. This can cause problems such as infinite loops during bean creation and makes the codebase hard to understand and maintain. Here are some common ways to solve circular dependencies, with a focus on Java and the Spring framework:

1. Use Setter Injection Instead of Constructor Injection

  • Principle: Constructor injection enforces that all dependencies are provided at the time of object creation. By using setter injection, we can create the objects first and then set the dependencies later, breaking the circular creation loop.
  • Example in Spring:
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Component;

    @Component
    class A {
        private B b;

        @Autowired
        public void setB(B b) {
            this.b = b;
        }
    }

    @Component
    class B {
        private A a;

        @Autowired
        public void setA(A a) {
            this.a = a;
        }
    }

In Spring, the container can create the A and B objects first and then use the setter methods to inject the dependencies.

2. Use a Factory Pattern

  • Principle: A factory class can be used to create objects and manage their dependencies. This way, the objects can be created without immediately resolving all the dependencies.
  • Example:
    class Factory {
        private static A a;
        private static B b;

        // Create both objects first, then wire the references, so neither
        // object needs the other to exist while it is being constructed.
        private static synchronized void init() {
            if (a == null) {
                a = new A();
                b = new B();
                a.setB(b);
                b.setA(a);
            }
        }

        public static A getA() {
            init();
            return a;
        }

        public static B getB() {
            init();
            return b;
        }
    }

    class A {
        private B b;

        public void setB(B b) {
            this.b = b;
        }
    }

    class B {
        private A a;

        public void setA(A a) {
            this.a = a;
        }
    }

The Factory class ensures that the objects are created and their dependencies are set in a controlled way.

3. Use Lazy Initialization

  • Principle: Instead of initializing the dependencies immediately, delay the initialization until they are actually needed. In Spring, this can be achieved using the @Lazy annotation.
  • Example in Spring:
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Lazy;
    import org.springframework.stereotype.Component;

    @Component
    class A {
        private final B b;

        @Autowired
        public A(@Lazy B b) {
            this.b = b;
        }
    }

    @Component
    class B {
        private final A a;

        @Autowired
        public B(@Lazy A a) {
            this.a = a;
        }
    }

The @Lazy annotation tells Spring to create the bean only when it is first accessed, which can break the circular creation loop.

What scope types are there?

1. Scope types in Spring

In the Spring framework, a bean's scope defines how the container creates and manages instances of that bean. The common scope types are:

(1) singleton

  • Description: the default scope in Spring. For a singleton bean, the container creates exactly one instance for its entire lifetime, and every request for that bean returns the same instance.
  • Example configuration
    <bean id="mySingletonBean" class="com.example.MyBean" scope="singleton"/>

Or with annotations:

import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;

@Component
@Scope("singleton")
public class MyBean {
    // class body
}

(2) prototype

  • Description: the container creates a new instance every time the bean is requested, so repeated requests for a prototype-scoped bean yield different instances.
  • Example configuration
    <bean id="myPrototypeBean" class="com.example.MyBean" scope="prototype"/>

Or with annotations:

import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;

@Component
@Scope("prototype")
public class MyBean {
    // class body
}

(3) request

  • Description: only available in web-based Spring applications. Within the lifetime of a single HTTP request, the container creates and returns the same bean instance; different HTTP requests get different instances.
  • Example configuration
    <bean id="myRequestBean" class="com.example.MyBean" scope="request"/>

Or with annotations:

import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;
import org.springframework.web.context.WebApplicationContext;

@Component
@Scope(WebApplicationContext.SCOPE_REQUEST)
public class MyBean {
    // class body
}

(4) session

  • Description: also for web-based Spring applications. Within the lifetime of a user session, the container creates and returns the same bean instance; different user sessions get different instances.
  • Example configuration
    <bean id="mySessionBean" class="com.example.MyBean" scope="session"/>

Or with annotations:

import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;
import org.springframework.web.context.WebApplicationContext;

@Component
@Scope(WebApplicationContext.SCOPE_SESSION)
public class MyBean {
    // class body
}

(5) application

  • Description: for web-based Spring applications. The container creates and returns the same bean instance for the lifetime of the entire web application.
  • Example configuration
    <bean id="myApplicationBean" class="com.example.MyBean" scope="application"/>

Or with annotations:

import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;
import org.springframework.web.context.WebApplicationContext;

@Component
@Scope(WebApplicationContext.SCOPE_APPLICATION)
public class MyBean {
    // class body
}

(6) websocket

  • Description: introduced in Spring 4.2 for WebSocket-based applications. The container creates and returns the same bean instance for the lifetime of a WebSocket session.
  • Example configuration (the scope is registered under the name "websocket" by Spring's WebSocket support)
    import org.springframework.context.annotation.Scope;
    import org.springframework.stereotype.Component;

    @Component
    @Scope(scopeName = "websocket")
    public class MyBean {
        // class body
    }

2. What happens when a prototype bean is injected into a singleton bean

When a prototype-scoped bean is injected into a singleton-scoped bean, keep the following in mind:

(1) Default behavior

By default, the singleton bean obtains one instance of the prototype bean when the singleton itself is created, and it uses that same instance for its entire lifetime. Although the prototype scope is meant to produce a new instance per request, the singleton bean is created only once, so it fetches the prototype instance only once and never obtains a fresh one afterwards.

(2) Example

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Scope;

// Define the prototype bean
@Configuration
class AppConfig {
    @Bean
    @Scope("prototype")
    public PrototypeBean prototypeBean() {
        return new PrototypeBean();
    }

    @Bean
    public SingletonBean singletonBean() {
        return new SingletonBean();
    }
}

class PrototypeBean {
    // A static counter gives each instance a distinct id, so we can see
    // whether two references point to the same instance.
    private static int instances = 0;
    private final int id;

    public PrototypeBean() {
        id = ++instances;
        System.out.println("PrototypeBean instance created, id: " + id);
    }

    public int getId() {
        return id;
    }
}

class SingletonBean {
    @Autowired
    private PrototypeBean prototypeBean;

    public PrototypeBean getPrototypeBean() {
        return prototypeBean;
    }
}

public class Main {
    public static void main(String[] args) {
        AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext(AppConfig.class);
        SingletonBean singletonBean1 = context.getBean(SingletonBean.class);
        SingletonBean singletonBean2 = context.getBean(SingletonBean.class);

        System.out.println("id of singletonBean1's prototypeBean: " + singletonBean1.getPrototypeBean().getId());
        System.out.println("id of singletonBean2's prototypeBean: " + singletonBean2.getPrototypeBean().getId());

        context.close();
    }
}

(3) Explanation

In the code above, PrototypeBean is prototype-scoped, SingletonBean is singleton-scoped, and a PrototypeBean is injected into SingletonBean. Running it shows that singletonBean1 and singletonBean2 hold the same PrototypeBean instance: SingletonBean fetched a PrototypeBean once when it was created and never fetches a new one.

(4) Workaround

If you want a fresh prototype instance on every use, obtain it through an ObjectFactory or a Provider. For example:

import org.springframework.beans.factory.ObjectFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Scope;

// Define the prototype bean
@Configuration
class AppConfig {
    @Bean
    @Scope("prototype")
    public PrototypeBean prototypeBean() {
        return new PrototypeBean();
    }

    @Bean
    public SingletonBean singletonBean() {
        return new SingletonBean();
    }
}

class PrototypeBean {
    private static int instances = 0;
    private final int id;

    public PrototypeBean() {
        id = ++instances;
        System.out.println("PrototypeBean instance created, id: " + id);
    }

    public int getId() {
        return id;
    }
}

class SingletonBean {
    @Autowired
    private ObjectFactory<PrototypeBean> prototypeBeanFactory;

    public PrototypeBean getPrototypeBean() {
        // Each call asks the container for a fresh prototype instance
        return prototypeBeanFactory.getObject();
    }
}

public class Main {
    public static void main(String[] args) {
        AnnotationConfigApplicationContext context = new AnnotationConfigApplicationContext(AppConfig.class);
        SingletonBean singletonBean = context.getBean(SingletonBean.class);

        PrototypeBean prototypeBean1 = singletonBean.getPrototypeBean();
        PrototypeBean prototypeBean2 = singletonBean.getPrototypeBean();

        System.out.println("prototypeBean1 id: " + prototypeBean1.getId());
        System.out.println("prototypeBean2 id: " + prototypeBean2.getId());

        context.close();
    }
}

In this improved version, SingletonBean obtains PrototypeBean instances through an ObjectFactory, and every call to getObject() returns a new PrototypeBean instance.

Classic example

@Component
@Scope(value = "prototype")
public class PrototypeBean {}

@Component
@Scope(value = "prototype")
public class TestCls {
    @Autowired
    private PrototypeBean prototypeBean;

    public void execute() {
        // Is prototypeBean a new instance on every call?
        System.out.println(prototypeBean);
    }
}

TestCls s1 = context.getBean(TestCls.class); // s1's prototypeBean is p1
TestCls s2 = context.getBean(TestCls.class); // s2's prototypeBean is p2

s1.execute(); // prints p1
s1.execute(); // still p1 (the same instance)

s2.execute(); // prints p2

Conclusion: prototypeBean is not new on every call; it is injected exactly once.

Why:
  • TestCls itself is @Scope("prototype"), so Spring creates a new TestCls instance each time one is requested.
  • But the PrototypeBean is injected only once, when that TestCls instance is created.
  • So each TestCls instance has its own prototypeBean instance, yet repeated calls to execute() on the same instance print the same prototypeBean object.

AOP

AOP (Aspect-Oriented Programming) is a programming paradigm that lets developers add behavior to a program, such as logging, transaction management, or permission checks, without modifying its existing business logic. In Java, Spring AOP is the usual framework for AOP. The key concepts and an example follow.

Key concepts

  • Aspect: a class that encapsulates a cross-cutting concern; it contains advice and pointcuts.
  • Advice: defines what action to take and when; common types are before, after, around, after-throwing, and after-returning advice.
  • Pointcut: defines the join points at which advice applies, i.e. which methods are enhanced.
  • Join point: a point during program execution where an aspect can be plugged in, typically a method invocation.

Example

Below is a Spring AOP example that adds logging:

package com.example.aop;

import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.After;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;
import org.aspectj.lang.annotation.Pointcut;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class LoggingAspect {

    @Pointcut("execution(* com.example.aop.UserService.*(..))")
    public void userServiceMethods() {}

    @Before("userServiceMethods()")
    public void beforeAdvice(JoinPoint joinPoint) {
        System.out.println("Before method: " + joinPoint.getSignature().getName());
    }

    @After("userServiceMethods()")
    public void afterAdvice(JoinPoint joinPoint) {
        System.out.println("After method: " + joinPoint.getSignature().getName());
    }
}
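
The walkthrough below also refers to a UserService and a Main class; here is a minimal sketch consistent with the output shown (the AppConfig configuration class is an assumption):

package com.example.aop;

import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.EnableAspectJAutoProxy;
import org.springframework.stereotype.Service;

@Service
class UserService {
    public void addUser(String name) {
        System.out.println("Adding user: " + name);
    }
}

@Configuration
@EnableAspectJAutoProxy            // enables proxy creation for @Aspect beans
@ComponentScan("com.example.aop")
class AppConfig {}

public class Main {
    public static void main(String[] args) {
        AnnotationConfigApplicationContext context =
                new AnnotationConfigApplicationContext(AppConfig.class);
        UserService userService = context.getBean(UserService.class);
        userService.addUser("John");
        context.close();
    }
}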

Code walkthrough

  1. Main class: the program entry point. It uses AnnotationConfigApplicationContext to load the Spring configuration, obtains the UserService bean, and calls its addUser method.
  2. UserService class: a simple business service class with an addUser method for adding a user.
  3. LoggingAspect class: the aspect class, containing:
    • a @Pointcut that matches all methods of the com.example.aop.UserService class;
    • a @Before advice that runs before the target method and prints the method name;
    • an @After advice that runs after the target method and prints the method name.

Output

Running the code prints:

Before method: addUser
Adding user: John
After method: addUser

This example shows how AOP adds logging without modifying the existing business logic of UserService.

Data Access

  • JDBC, Statement vs PreparedStatement, DataSource
    JDBC (Java Database Connectivity) is an API for interacting with databases in Java. Statement executes SQL strings directly, which leaves it vulnerable to SQL injection. PreparedStatement is a more secure and efficient alternative: the SQL is precompiled and parameters are bound separately, preventing SQL injection. A DataSource is a factory for database connections; it manages the connection pool and hands connections to the application. (See the sketch after this list.)
  • Hibernate ORM, Session, Cache
    Hibernate ORM is an Object Relational Mapping framework that allows you to map Java objects to database tables. A Session in Hibernate is a lightweight, short-lived object that provides an interface to interact with the database. It is used to perform operations like saving, loading, and deleting objects. Hibernate also has a caching mechanism to improve performance. It can cache objects in memory to reduce database access. There are different levels of caches, such as the first-level cache (session-level cache) and the second-level cache (shared cache across sessions).
  • Optimistic Locking - add version column
    Optimistic locking is a concurrency control mechanism used in databases. In the context of Hibernate, it can be implemented by adding a version column to the database table. When an object is loaded, the version number is also loaded. When the object is updated, Hibernate checks if the version number has changed. If it has, it means the object has been modified by another transaction, and the update will fail, preventing data conflicts.
  • Association: many - to - many
    In object-relational mapping, a many-to-many association is used when multiple objects of one entity can be related to multiple objects of another entity. For example, in a system with users and roles, a user can have multiple roles, and a role can be assigned to multiple users. In Hibernate, this is usually mapped using a join table and appropriate annotations like @ManyToMany and @JoinTable.
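
A minimal sketch of the Statement vs PreparedStatement point (the users table and column names are invented for the example):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import javax.sql.DataSource;

public class UserDao {
    private final DataSource dataSource;   // pooled connections come from here

    public UserDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // The SQL is precompiled with a placeholder and the value is bound as a
    // parameter, so input like "x' OR '1'='1" cannot change the query structure.
    public String findEmail(long userId) throws Exception {
        String sql = "SELECT email FROM users WHERE id = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, userId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("email") : null;
            }
        }
    }
}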

Transaction JPA

  • @Transactional - atomic operation
    The @Transactional annotation in Spring JPA is used to mark a method or a class as a transactional operation. It ensures that the operations within the method are executed atomically. That is, either all the operations succeed and are committed to the database, or if an error occurs, all the operations are rolled back, maintaining data consistency.
  • Propagation, Isolation
    Transaction propagation defines how a transaction should behave when a transactional method calls another transactional method. There are several propagation types like REQUIRED, REQUIRES_NEW, SUPPORTS, etc. Isolation levels define the degree to which one transaction is isolated from other transactions. Common isolation levels are READ_UNCOMMITTED, READ_COMMITTED, REPEATABLE_READ, and SERIALIZABLE. Each level has different trade-offs in terms of data consistency and concurrency.
  • JPA naming convention
    JPA has certain naming conventions for mapping entity classes to database tables and columns. By default, it uses a naming strategy where the entity class name is mapped to the table name, and the property names are mapped to column names. However, you can also customize the naming using annotations like @Table and @Column to specify different names if needed.
  • Paging and Sorting Using JPA
    JPA provides support for paging and sorting data. You can use the Pageable interface and related classes to specify the page number, page size, and sorting criteria. For example, you can call findAll(Pageable pageable) on a JPA repository to retrieve a paginated and sorted list of entities, as in the sketch after this list.
  • Hibernate Persistence Context
    The Hibernate persistence context is a set of managed entities that are associated with a particular session. It tracks the state of the entities and is responsible for synchronizing the changes between the entities and the database. It manages the lifecycle of the entities, including loading, saving, and deleting them.
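
A minimal sketch of paging and sorting with Spring Data JPA (Employee and its repository are invented for the example; JPA annotations are omitted for brevity):

import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Sort;
import org.springframework.data.jpa.repository.JpaRepository;

class Employee {                  // hypothetical entity
    Long id;
    String name;
    double salary;
}

interface EmployeeRepository extends JpaRepository<Employee, Long> {}

class EmployeeQueries {
    private final EmployeeRepository repository;

    EmployeeQueries(EmployeeRepository repository) {
        this.repository = repository;
    }

    Page<Employee> secondPageBySalary() {
        // Page index is zero-based: this is the second page of 20 results,
        // sorted by salary in descending order.
        return repository.findAll(PageRequest.of(1, 20, Sort.by(Sort.Direction.DESC, "salary")));
    }
}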

Security

  • How to implement Security by overriding Spring class
    You can implement security in a Spring application by overriding certain Spring security classes. For example, you can extend WebSecurityConfigurerAdapter and override methods like configure(HttpSecurity http) to define custom security configurations such as access rules, authentication mechanisms, etc. You can also override other classes like UserDetailsService to provide custom user authentication and authorization logic.
  • Basic Authentication and password encryption
    Basic authentication is a simple mechanism in which the client sends the username and password in the request headers; Spring makes it easy to configure. Password encryption is crucial for security. Spring provides password encoders such as BCryptPasswordEncoder to securely hash and store passwords: when a user registers or changes their password, the password is hashed and stored in the database, and during authentication the submitted password is checked against the stored hash (see the sketch after this list).
  • JWT Token and workflow
    JSON Web Token (JWT) is a widely used token-based authentication and authorization mechanism. The workflow typically involves the client sending username and password to the server for authentication. If the authentication is successful, the server generates a JWT token containing user information and a signature. The client then stores the token and sends it in the headers of subsequent requests. The server validates the token on each request and authorizes the user based on the information in the token.
  • Oauth2 workflow
    OAuth2 is an authorization framework that allows users to grant limited access to their resources on one server to another server without sharing their credentials. The typical OAuth2 workflow involves steps like the client redirecting the user to the authorization server for authentication and authorization, the user granting permission, the authorization server issuing an access token, and the client using the access token to access protected resources on the resource server.
  • Authorization based on User role
    In a Spring security application, authorization based on user roles can be implemented by assigning different roles to users and configuring access rules based on those roles. You can use annotations like @PreAuthorize or configure access rules in the security configuration to specify which roles are allowed to access which resources or perform which operations. For example, you can define that only users with the ROLE_ADMIN role can access certain administrative endpoints.
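
For the password-encryption point, a minimal sketch using Spring Security's BCryptPasswordEncoder:

import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;

public class PasswordDemo {
    public static void main(String[] args) {
        BCryptPasswordEncoder encoder = new BCryptPasswordEncoder();

        // Store only the salted hash, never the raw password.
        String hash = encoder.encode("s3cret");

        // At login, check the submitted password against the stored hash.
        System.out.println(encoder.matches("s3cret", hash));   // true
        System.out.println(encoder.matches("wrong", hash));    // false
    }
}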

REST API

In the Spring framework, @ExceptionHandler and @ControllerAdvice are important annotations for handling exceptions at the controller level. Their usage is described below.

How to handle controller exceptions

@ExceptionHandler

@ExceptionHandler lets you define exception-handling methods inside a controller class; when a controller method throws an exception of the specified type, the matching handler method is invoked.

Example

import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MyController {

    @RequestMapping("/test")
    public String test() {
        throw new RuntimeException("test exception");
    }

    @ExceptionHandler(RuntimeException.class)
    public String handleRuntimeException(RuntimeException e) {
        return "Handled runtime exception: " + e.getMessage();
    }
}

Walkthrough

  • The @RequestMapping("/test") method throws a RuntimeException when it is called.
  • The handleRuntimeException method, annotated with @ExceptionHandler(RuntimeException.class), handles exceptions of type RuntimeException and returns a string containing the exception message.

@ControllerAdvice

@ControllerAdvice defines a global exception-handling class; the @ExceptionHandler methods inside it can handle exceptions thrown by any controller.

Example

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
public class GlobalExceptionHandler {

    @ExceptionHandler(RuntimeException.class)
    @ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)
    public String handleRuntimeException(RuntimeException e) {
        return "Globally handled runtime exception: " + e.getMessage();
    }
}

Walkthrough

  • @RestControllerAdvice marks this as a global exception-handling class.
  • The handleRuntimeException method, annotated with @ExceptionHandler(RuntimeException.class), handles RuntimeException thrown by any controller.
  • @ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR) sets the response status code to 500.

Combining them

@ExceptionHandler is usually combined with @ControllerAdvice to implement global exception handling.

Example

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestControllerAdvice;

@RestControllerAdvice
public class GlobalExceptionHandler {

    @ExceptionHandler(Exception.class)
    @ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR)
    public String handleException(Exception e) {
        return "Globally handled exception: " + e.getMessage();
    }
}

Walkthrough

  • The GlobalExceptionHandler class annotated with @RestControllerAdvice is the global exception-handling class.
  • The handleException method, annotated with @ExceptionHandler(Exception.class), handles Exception thrown by any controller.
  • @ResponseStatus(HttpStatus.INTERNAL_SERVER_ERROR) sets the response status code to 500.

With @ExceptionHandler and @ControllerAdvice you can handle exceptions throughout a Spring application, improving its robustness and the user experience.

1. DispatcherServlet

Definition

DispatcherServlet is a key component of the Spring Web MVC framework. It serves as the front controller in a Spring-based web application. A front controller is a single servlet that receives all HTTP requests and then dispatches them to the appropriate handlers (controllers) based on the request's URL, HTTP method, and other criteria.

Function

  • Request Routing: It maps incoming requests to the appropriate @Controller classes and their methods using the configured handler mappings. For example, it can match a request to a specific controller method based on the URL pattern defined in the @RequestMapping annotation (see the sketch after this list).
  • View Resolution: After a controller method processes the request and returns a logical view name, the DispatcherServlet uses a view resolver to map this logical name to an actual view (such as a JSP page or a Thymeleaf template) and renders the response.
  • Intercepting and Pre-processing: It can also use interceptors to perform pre-processing and post-processing tasks on requests and responses, like logging, authentication checks, etc.
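
To make the routing concrete, here is a minimal, hypothetical controller; the DispatcherServlet matches an incoming GET /users/42 to this method through its handler mappings (the path and names are illustrative).

    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RestController;

    @RestController
    public class UserController {

        // DispatcherServlet routes GET /users/{id} here based on this mapping.
        @GetMapping("/users/{id}")
        public String getUser(@PathVariable long id) {
            // For @RestController methods the return value is written straight
            // to the response body; no view resolution takes place.
            return "user-" + id;
        }
    }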

2. Rest API

Definition

REST (Representational State Transfer) is an architectural style for building web services. A REST API (Application Programming Interface) is a set of rules and conventions for creating and consuming web services based on the REST principles.

Characteristics

  • Stateless: Each request from a client to a server must contain all the information necessary to understand and process the request. The server does not store any client-specific state between requests.
  • Resource-Oriented: Resources are the key abstractions in a REST API. Resources can be things like users, products, or orders, and are identified by unique URIs (Uniform Resource Identifiers).
  • HTTP Verbs: REST APIs use standard HTTP methods (verbs) to perform operations on resources. For example, GET is used to retrieve a resource, POST to create a new resource, PUT to update an existing resource, and DELETE to remove a resource.

3. How to create a good REST API

Design Principles

  • Use Clear and Descriptive URIs: URIs should clearly represent the resources. For example, use /users to represent a collection of users and /users/{userId} to represent a specific user.
  • Follow HTTP Verbs Correctly: Use GET for retrieval, POST for creation, PUT for full updates, PATCH for partial updates, and DELETE for deletion (see the sketch after this list).
  • Return Appropriate HTTP Status Codes: Indicate the result of the request clearly. For example, return 200 for successful retrievals, 201 for successful creations, and 4xx or 5xx for errors.
  • Provide Good Documentation: Use tools like Swagger to generate documentation that explains the API endpoints, their input parameters, and expected output.
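
The sketch below pulls the URI and status-code conventions together in one hypothetical controller; the in-memory store is a stand-in for a real service layer.

    import java.net.URI;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicLong;

    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/users")
    public class UsersApi {

        record User(long id, String name) {}

        private final Map<Long, User> store = new ConcurrentHashMap<>();
        private final AtomicLong ids = new AtomicLong();

        // GET /users/{id} -> 200 with the resource, or 404 if it does not exist.
        @GetMapping("/{id}")
        public ResponseEntity<User> get(@PathVariable long id) {
            User user = store.get(id);
            return user == null ? ResponseEntity.notFound().build() : ResponseEntity.ok(user);
        }

        // POST /users -> 201 Created with a Location header for the new resource.
        @PostMapping
        public ResponseEntity<User> create(@RequestBody String name) {
            long id = ids.incrementAndGet();
            User user = new User(id, name);
            store.put(id, user);
            return ResponseEntity.created(URI.create("/users/" + id)).body(user);
        }
    }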

Security and Performance

  • Authentication and Authorization: Implement proper authentication mechanisms (e.g., OAuth, JWT) to ensure that only authorized users can access the API.
  • Caching: Implement caching strategies to reduce the load on the server and improve response times.

4. HTTP Error Codes

  • 200 OK: Indicates that the request has succeeded. It is commonly used for successful GET requests to retrieve a resource or successful PUT/PATCH requests to update a resource.
  • 201 Created: Used when a new resource has been successfully created. For example, when a client sends a POST request to create a new user, and the server successfully creates the user, it returns a 201 status code.
  • 400 Bad Request: Signifies that the server cannot process the request due to a client-side error, such as malformed request syntax, invalid request message framing, or deceptive request routing.
  • 401 Unauthorized: Indicates that the request requires user authentication. The client needs to provide valid credentials to access the requested resource.
  • 403 Forbidden: The client is authenticated, but it does not have permission to access the requested resource. For example, a regular user trying to access an administrative-only endpoint.
  • 404 Not Found: The requested resource could not be found on the server. This might be because the URL is incorrect or the resource has been deleted.
  • 500 Internal Server Error: A generic error message indicating that the server encountered an unexpected condition that prevented it from fulfilling the request. It could be due to a programming error, database issues, etc.
  • 502 Bad Gateway: The server, while acting as a gateway or proxy, received an invalid response from an upstream server.
  • 503 Service Unavailable: The server is currently unable to handle the request due to temporary overloading or maintenance. The client may try again later.
  • 504 Gateway Timeout: The server, while acting as a gateway or proxy, did not receive a timely response from an upstream server.

5. Introduction of GraphQL, WebSocket, gRPC

GraphQL

  • Definition: GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. It allows clients to specify exactly what data they need from an API, reducing over-fetching and under-fetching of data.
  • Advantages: It provides a more efficient way of data retrieval compared to traditional REST APIs, especially in complex applications where clients may need different subsets of data. It also has a strong type system and can be introspected by clients.

WebSocket

  • Definition: WebSocket is a communication protocol that provides full-duplex communication channels over a single TCP connection. It enables real-time communication between a client and a server.
  • Advantages: It reduces the overhead of traditional HTTP requests by maintaining a persistent connection, which is suitable for applications that require real-time updates, such as chat applications, online gaming, and live dashboards.

gRPC

  • Definition: gRPC is a high-performance, open-source universal RPC (Remote Procedure Call) framework. It uses Protocol Buffers as the interface definition language and serialization format.
  • Advantages: It offers high performance, low latency, and strong typing. It is suitable for microservices architectures where efficient communication between services is crucial.

6. ReactiveJava

Definition

ReactiveJava (best known through the RxJava library) is a Java implementation of the Reactive Extensions (Rx) library. It is used for reactive programming, a programming paradigm that deals with asynchronous data streams and the propagation of change.

Key Concepts

  • Observable: Represents a source of data that can emit zero or more items over time. An Observable can emit data synchronously or asynchronously.
  • Subscriber: A Subscriber subscribes to an Observable to receive the emitted items. It can react to the data, errors, or the completion of the data stream.
  • Operators: ReactiveJava provides a rich set of operators that can be used to transform, filter, combine, and manipulate the data streams. For example, the map operator can be used to transform each item in the stream, and the filter operator can be used to filter out unwanted items (a short example follows this list).
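
A short example of these concepts, assuming the RxJava 3 library (io.reactivex.rxjava3:rxjava) is on the classpath:

    import io.reactivex.rxjava3.core.Observable;

    public class RxExample {
        public static void main(String[] args) {
            Observable.just(1, 2, 3, 4, 5)          // Observable: emits five items
                    .map(i -> i * 10)               // map: transform each item
                    .filter(i -> i > 20)            // filter: drop unwanted items
                    .subscribe(
                            item -> System.out.println("next: " + item),    // onNext
                            error -> System.err.println("error: " + error), // onError
                            () -> System.out.println("done"));              // onComplete
        }
    }

This prints next: 30, next: 40, next: 50, and then done.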

Use Cases

  • Asynchronous Programming: It simplifies asynchronous programming by providing a declarative way to handle asynchronous operations. For example, in a web application, it can be used to handle multiple asynchronous API calls and combine their results.
  • Event-Driven Programming: It is well-suited for event-driven applications where events need to be processed in a reactive and efficient manner. For example, in a GUI application, it can be used to handle user input events and update the UI accordingly.

MongoDB vs Cassandra introduction

MongoDB and Cassandra are both popular NoSQL databases, but they have different characteristics and use cases. Here is an introduction to their differences:

Data Model

  • MongoDB: It uses a document-oriented data model. Data is stored in BSON (Binary JSON) format, which allows for flexible and nested data structures. Documents can have different fields and structures within the same collection, making it suitable for applications where the data schema may change frequently or is not well-defined upfront. For example, in a content management system, different types of content like blog posts, images, and videos can be stored in the same collection with each document having its own set of relevant fields.
  • Cassandra: It employs a wide-column data model. Data is organized into tables, rows, and columns, similar to a traditional relational database, but with more flexibility: the columns are dynamic, and a row can have a different set of columns than other rows. Cassandra is optimized for handling large amounts of data with a high write throughput and is often used in applications that require efficient storage and retrieval of time-series data or data that needs to be partitioned and distributed across multiple nodes. For instance, in an IoT (Internet of Things) application, sensor data can be stored in Cassandra with each sensor’s data as a row and the timestamp and sensor readings as columns.

Querying Capabilities

  • MongoDB: Supports rich querying capabilities. It allows for complex queries using a JSON-like query language. You can query based on field values, use operators like $gt (greater than), $lt (less than), $in, etc., and perform queries on nested fields and arrays. It also supports indexing to improve query performance. For example, you can easily query for all blog posts authored by a specific user or all documents with a certain tag.
  • Cassandra: Its querying capabilities are more limited compared to MongoDB. Queries in Cassandra are mainly based on the primary key. You can query by the partition key and optionally the clustering key. It does not support ad-hoc queries as easily as MongoDB. However, it can perform very efficiently for the queries it is designed to handle, such as retrieving a range of data based on the primary key or querying for data within a specific partition.

Scalability and Performance

  • MongoDB: Scales horizontally using sharding. It can distribute data across multiple servers, called shards, to handle large amounts of data and high traffic. MongoDB is known for its good read performance and can handle a moderate to high number of read operations. In a sharded environment, it can route queries to the appropriate shards efficiently.
  • Cassandra: Is highly scalable and designed to handle massive amounts of data and high write loads. It uses a distributed architecture where data is replicated across multiple nodes for fault tolerance and scalability. Cassandra can handle a very high volume of write operations and can scale out easily by adding more nodes to the cluster. It is often used in applications that require continuous data ingestion, such as social media platforms or financial systems that need to handle a large number of transactions.

Use Cases

  • MongoDB: Suited for applications where the data structure is flexible and dynamic, such as content management systems, mobile applications, and web applications with evolving data requirements. It is also a good choice for applications that require complex querying and indexing capabilities, like e-commerce platforms where you need to query products based on various attributes.
  • Cassandra: Ideal for applications that deal with large volumes of data, high write throughput, and where data needs to be distributed and replicated across multiple nodes. It is commonly used in real-time analytics, IoT applications, social media platforms for storing user-generated content and activity streams, and in financial systems for handling high-frequency trading data and transaction records.

Monitoring: Splunk, Grafana, Kibana, CloudWatch

The following is an introduction to these monitoring technologies:

Splunk

  • Overview: Splunk is a powerful data analytics platform that is widely used for monitoring and analyzing machine data. It can ingest, index, and correlate data from various sources such as logs, metrics, and events.
  • Features:
    • Data Collection: It can collect data from a large number of sources including servers, applications, network devices, etc.
    • Search and Analytics: Provides a powerful search language that allows users to quickly query and analyze data to find patterns, troubleshoot issues, and gain insights.
    • Visualization: Enables users to create various visualizations like dashboards, charts, and graphs to present data in an intuitive way.
    • Alerting: Can set up alerts based on specific conditions or thresholds, notifying users when important events occur.
  • Use Cases: Commonly used in IT operations for monitoring infrastructure health, in security for detecting threats and analyzing security incidents, and in business for analyzing customer behavior and operational data.

Grafana

  • Overview: Grafana is an open-source data visualization and monitoring tool. It focuses mainly on presenting data in a visually appealing and understandable way, making it easy for users to monitor and analyze metrics.
  • Features:
    • Data Sources: Supports a wide range of data sources such as Prometheus, InfluxDB, MySQL, etc.
    • Visualization Options: Offers a variety of visualization types including line charts, bar charts, pie charts, heatmaps, and more. Users can customize dashboards to display the data they need.
    • Alerting System: Allows setting up alerts based on metric values and conditions. It can send notifications through various channels like email, Slack, etc.
    • Plugin System: Has a rich ecosystem of plugins that can extend its functionality, enabling integration with other tools and adding new features.
  • Use Cases: It is popular in DevOps and IT teams for monitoring application performance, infrastructure metrics, and for visualizing time-series data. It helps in quickly identifying trends and anomalies in the data.

Kibana

  • Overview: Kibana is an open-source data visualization and exploration tool that is closely integrated with Elasticsearch. It is used to visualize and analyze data stored in Elasticsearch.
  • Features:
    • Data Visualization: Allows users to create a variety of visualizations such as bar charts, line charts, maps, and histograms. It provides an intuitive interface for exploring and filtering data.
    • Dashboard Creation: Users can easily create and customize dashboards to display multiple visualizations in one place, providing a comprehensive view of the data.
    • Search and Filtering: Provides a powerful search and filtering functionality to quickly find and analyze specific data subsets.
    • Time-series Analysis: Specializes in analyzing time-series data, which is useful for monitoring and understanding how data changes over time.
  • Use Cases: Commonly used in log analysis, security information and event management (SIEM), and for monitoring the performance of applications and infrastructure. It is widely used in combination with Elasticsearch for large-scale data analysis and monitoring.

CloudWatch

  • Overview: CloudWatch is a monitoring and observability service provided by Amazon Web Services (AWS). It allows users to monitor AWS resources and the applications running on them.
  • Features:
    • Resource Monitoring: Automatically collects metrics from various AWS resources such as EC2 instances, RDS databases, S3 buckets, etc.
    • Custom Metrics: Allows users to define and send their own custom metrics to CloudWatch for monitoring application-specific performance indicators.
    • Alarms: Can set up alarms based on metric thresholds and events. It can trigger actions such as sending notifications, auto-scaling resources, or invoking Lambda functions.
    • Logs Management: Integrates with AWS CloudTrail and other services to collect and store logs. Users can analyze logs to gain insights into the behavior of their applications and resources.
  • Use Cases: In the AWS ecosystem, it is essential for monitoring the health and performance of cloud-based applications and infrastructure. It helps in optimizing resource utilization, detecting and resolving issues quickly, and ensuring the reliability of applications running on AWS.

Saga pattern

The Saga pattern is a design pattern used in distributed systems and microservices architecture to manage data consistency and handle transactions that involve multiple services. Here is a detailed introduction:

Definition and Concept

A saga is a sequence of transactions that are executed in a coordinated manner across multiple microservices or distributed components. Each transaction in the saga is a local operation within a single service, and the saga orchestrates these transactions to achieve a consistent outcome across the entire system. If any of the transactions in the saga fails, the saga must take appropriate action to roll back the changes made by the previous transactions in order to maintain data consistency.

Working Mechanism

  • Orchestration: There are two main ways to orchestrate sagas - choreography and orchestration. In choreography, the participating services communicate with each other directly to coordinate the execution of the saga steps. They follow a predefined protocol or set of messages to determine the order of execution and handle failures. In orchestration, there is a central orchestrator that controls the execution of the saga. It sends commands to the individual services to perform specific steps and monitors the progress of the saga.
  • Compensating Transactions: Each step in a saga has a corresponding compensating transaction. When a failure occurs, the compensating transactions are used to undo the effects of the previously executed steps. For example, if a saga involves creating an order in one service and reserving inventory in another, and the inventory reservation fails, the compensating transaction for the order creation service would delete the created order (a compensation sketch follows this list).
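
A minimal sketch of orchestration with compensating transactions, assuming each step exposes an execute/compensate pair (the interface and names are illustrative, not a specific saga library):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.List;

    public class SagaOrchestrator {

        interface SagaStep {
            void execute();     // the local transaction in one service
            void compensate();  // undoes the effect of execute()
        }

        public void run(List<SagaStep> steps) {
            Deque<SagaStep> completed = new ArrayDeque<>();
            for (SagaStep step : steps) {
                try {
                    step.execute();
                    completed.push(step);
                } catch (RuntimeException failure) {
                    // A step failed: run the compensations of the completed
                    // steps in reverse order, then surface the failure.
                    while (!completed.isEmpty()) {
                        completed.pop().compensate();
                    }
                    throw failure;
                }
            }
        }
    }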

Use Cases

  • E-commerce Systems: In an e-commerce application, a saga can be used to handle complex operations such as placing an order. The saga might involve steps like creating an order record, reserving inventory, charging the customer’s credit card, and scheduling delivery. If any of these steps fail, the saga can roll back the previous steps to ensure data consistency.
  • Financial Transactions: For example, in a cross-border money transfer system, a saga can be used to manage the series of operations involved, such as deducting the amount from the sender’s account, converting the currency, and crediting the receiver’s account. If there is an issue during the currency conversion, the saga can roll back the deduction from the sender’s account.
  • Travel Booking Systems: When booking a trip that involves multiple services like flight booking, hotel reservation, and car rental, a saga can be used to ensure that all these operations are either completed successfully or rolled back in case of failures.

Advantages

  • Scalability: It allows microservices to operate independently and scale individually, as each service only needs to manage its own local transactions.
  • Flexibility: Sagas can be designed to handle complex business processes that involve multiple services and have different requirements and constraints.
  • Resilience: By using compensating transactions, sagas can recover from failures and maintain data consistency even in the face of partial successes or errors.

Disadvantages

  • Complexity: Designing and implementing sagas can be complex, especially when dealing with multiple services and complex business logic. The coordination and management of compensating transactions require careful planning and implementation.
  • Error Handling: Handling all possible error scenarios and ensuring that the compensating transactions work correctly in all cases can be challenging.
  • Performance Overhead: The need to coordinate multiple transactions and potentially execute compensating transactions can introduce some performance overhead compared to traditional single-transaction approaches.

Kubernetes, EKS, WCNP, KubeCtl

Kubernetes

  1. What is Kubernetes?
    • Answer: Kubernetes is an open-source container orchestration platform. It automates the deployment, scaling, and management of containerized applications. It allows you to group containers into logical units for easy management and discovery, and provides features like automatic pod scheduling, load balancing, and self-healing.
  2. What are the main components of Kubernetes?
    • Answer: The main components include the control plane (containing API Server, etcd, Scheduler, Controller Manager) and the nodes (with kubelet, kube-proxy, and container runtime). The control plane manages the cluster, while nodes run the pods and containers.
  3. Explain the concept of a pod in Kubernetes.
    • Answer: A pod is the smallest and simplest unit in Kubernetes that you can create and manage. It can contain one or more closely related containers that share the same network namespace and storage volumes. Pods are used to group containers that need to be co-located and co-scheduled on the same node.

EKS

  1. What is EKS?
    • Answer: EKS stands for Amazon Elastic Kubernetes Service. It is a managed Kubernetes service provided by Amazon Web Services. It allows you to easily run Kubernetes on AWS without having to manage the underlying infrastructure, taking care of tasks like provisioning servers, installing Kubernetes, and maintaining the cluster.
  2. What are the advantages of using EKS?
    • Answer: Some advantages include easy integration with other AWS services, high availability as it is managed by AWS, auto-scaling capabilities to handle varying workloads, and reduced operational overhead as AWS takes care of many maintenance tasks.
  3. How does EKS handle node management?
    • Answer: EKS allows you to use AWS EC2 instances as nodes. You can define node groups and configure auto-scaling for these node groups. EKS also provides tools and APIs to manage the lifecycle of nodes, such as adding or removing nodes based on the cluster’s needs.

WCNP

  1. What is WCNP?
    • Answer: WCNP stands for Walmart Cloud Native Platform, Walmart’s internal Kubernetes-based platform that bundles the company’s internal cloud services for deploying and running containerized applications.

KubeCtl

  1. What is KubeCtl?
    • Answer: KubeCtl is a command-line tool used to interact with a Kubernetes cluster. It allows you to perform various operations such as deploying applications, managing pods, services, and other Kubernetes resources, as well as viewing the status and logs of the cluster and its components.
  2. List some common KubeCtl commands.
    • Answer: Common commands include kubectl create to create resources, kubectl get to view resources, kubectl describe to get detailed information about a resource, kubectl delete to delete resources, and kubectl apply to apply configuration changes.
  3. How do you use KubeCtl to deploy an application?
    • Answer: First, you create a deployment configuration file (usually in YAML or JSON format) that defines the application’s pods, replicas, and other details. Then you use the kubectl apply -f <file> command, where <file> is the name of your configuration file. This will deploy the application to the Kubernetes cluster according to the specified configuration.

AWS Modules with examples

AWS (Amazon Web Services) offers a wide range of modules and services to build and manage various types of applications and infrastructure. Here are some of the key AWS modules with examples:

Compute Modules

  • Amazon Elastic Compute Cloud (EC2)
    • Description: A web service that provides resizable compute capacity in the cloud. It allows users to launch virtual servers, known as instances, with various operating systems and configurations.
    • Example: A startup might use EC2 instances to host their web application. They can choose an appropriate instance type based on their CPU, memory, and storage requirements. For instance, they could select a t2.micro instance for a small-scale development environment or an m5.xlarge instance for a more resource-intensive production application.
  • AWS Lambda
    • Description: A serverless compute service that lets you run code without provisioning or managing servers. It automatically scales based on the incoming request volume.
    • Example: A mobile application might use AWS Lambda to process user sign-up events. When a user signs up, the app triggers a Lambda function that validates the input, stores the user data in a database, and sends a welcome email (a handler sketch follows this list).
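
A hypothetical handler for the sign-up scenario above, using the RequestHandler interface from the aws-lambda-java-core library; the request type and the persistence/email steps are placeholders.

    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.RequestHandler;

    public class SignUpHandler implements RequestHandler<SignUpHandler.SignUpRequest, String> {

        public static class SignUpRequest {
            public String email;
        }

        @Override
        public String handleRequest(SignUpRequest request, Context context) {
            // Validate the input.
            if (request.email == null || !request.email.contains("@")) {
                throw new IllegalArgumentException("invalid email");
            }
            // A real function would store the user and send the welcome email
            // here (e.g., via DynamoDB and SES clients).
            context.getLogger().log("registered " + request.email);
            return "OK";
        }
    }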

Storage Modules

  • Amazon Simple Storage Service (S3)
    • Description: An object storage service that offers high scalability, data durability, and security. It is used to store and retrieve any amount of data from anywhere on the web.
    • Example: A media company could use S3 to store and distribute large video files. They can create an S3 bucket, upload the video files, and then use S3’s content delivery network (CDN) integration to serve the videos to users with low latency.
  • Amazon Elastic Block Store (EBS)
    • Description: A block-level storage service that provides persistent storage volumes for EC2 instances. It offers high-performance storage that can be attached to instances and used like a local hard drive.
    • Example: A database server running on an EC2 instance might use an EBS volume to store its data. The EBS volume can be sized according to the database’s storage needs and can be easily detached and attached to another instance for maintenance or scaling purposes.

Database Modules

  • Amazon Relational Database Service (RDS)
    • Description: A managed relational database service that makes it easy to set up, operate, and scale a relational database in the cloud. It supports popular database engines like MySQL, PostgreSQL, and Oracle.
    • Example: An e-commerce website could use RDS to manage its customer and order data. They can create an RDS instance with the appropriate database engine and configure it with the necessary storage and compute resources. The website’s application can then connect to the RDS instance to perform database operations such as inserting, updating, and querying data.
  • Amazon DynamoDB
    • Description: A fully managed NoSQL database service that offers fast and predictable performance with seamless scalability. It is designed for applications that require low-latency access to data.
    • Example: A mobile gaming company might use DynamoDB to store user game progress, leaderboard data, and in-game purchases. The database can handle the high write and read throughput required by the game, and it can scale automatically as the number of users grows.

Networking Modules

  • Amazon Virtual Private Cloud (VPC)
    • Description: Allows you to provision a logically isolated section of the AWS cloud where you can launch AWS resources in a virtual network that you define.
    • Example: A financial institution could create a VPC to host its critical applications and services. They can define subnets, route tables, and security groups within the VPC to ensure secure and isolated networking. For example, they might have a public subnet for web servers that need to be accessible from the internet and a private subnet for database servers that should only be accessible from within the VPC.
  • Amazon Route 53
    • Description: A highly available and scalable Domain Name System (DNS) web service. It translates domain names into IP addresses and routes internet traffic to the appropriate AWS resources.
    • Example: A company with multiple websites and applications can use Route 53 to manage their domain names and DNS records. They can create DNS records to point their domain names to the corresponding EC2 instances, load balancers, or other AWS services. For instance, they can set up an A record to map a domain name to the IP address of a web server hosted on EC2.

Test

1. Different Type of Tests in whole project lifecycle

  • Unit Tests: These are the most granular level of tests. They focus on testing individual units of code, such as a single function, method, or class. Unit tests are usually written by developers and are aimed at verifying that a particular piece of code behaves as expected in isolation. They help in catching bugs early in the development process and make the code easier to maintain.
  • Integration Tests: These tests check how different components or modules of the system work together. They ensure that the interfaces between various parts of the application are functioning correctly. For example, in a software system with a database layer, a business logic layer, and a presentation layer, integration tests would verify that data can flow properly between these layers.
  • System Tests: System tests evaluate the entire system as a whole to ensure that it meets the specified requirements. They simulate real-world scenarios and user interactions to test the system’s functionality, performance, and usability. This includes testing all the components together in the production-like environment.
  • Acceptance Tests: These tests are performed to determine whether the system meets the business requirements and is acceptable to the end-users or stakeholders. Acceptance tests can be user acceptance tests (UAT), where end-users test the system to see if it meets their needs, or contract acceptance tests, which are based on the requirements specified in a contract.
  • Regression Tests: After making changes to the system, such as bug fixes or new feature implementations, regression tests are run to ensure that the existing functionality has not been broken. They are a subset of the overall test suite that focuses on the areas of the system that are likely to be affected by the changes.

2. Unit Test, Mock

  • Unit Test: A unit test is a piece of code that exercises a specific unit of functionality in an isolated way. It provides a set of inputs to the unit under test and verifies that the output is as expected. Unit tests should be fast, independent, and repeatable. For example, in a Java application, a unit test for a method that calculates the sum of two numbers would provide different pairs of numbers as inputs and check if the calculated sum is correct.
  • Mock: In unit testing, a mock is an object that mimics the behavior of a real object, such as a database, a web service, or another component. Mocks are used when the real object is difficult to create, expensive to set up, or not available during testing. For instance, if a unit of code depends on a database call, instead of actually connecting to the database, a mock object can be used to return predefined data. This allows the unit test to focus on testing the logic of the unit under test without being affected by the external dependencies (a short example follows this list).
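
A compact sketch using JUnit 5 and Mockito; UserRepository and UserService are hypothetical types standing in for a real database-backed dependency.

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.when;

    import org.junit.jupiter.api.Test;

    class UserServiceTest {

        interface UserRepository {
            String findNameById(long id);
        }

        static class UserService {
            private final UserRepository repository;
            UserService(UserRepository repository) { this.repository = repository; }
            String greeting(long id) { return "Hello, " + repository.findNameById(id); }
        }

        @Test
        void greetingUsesRepositoryData() {
            // The mock stands in for the real repository, so no database is needed.
            UserRepository repository = mock(UserRepository.class);
            when(repository.findNameById(1L)).thenReturn("John");

            assertEquals("Hello, John", new UserService(repository).greeting(1L));
        }
    }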

3. Testing Rest Api with Rest Assured

Rest Assured is a Java library used for testing RESTful APIs. It simplifies the process of sending HTTP requests to an API and validating the responses.

  • Sending Requests: With Rest Assured, you can easily send different types of HTTP requests like GET, POST, PUT, DELETE, etc. For example, to send a GET request to an API endpoint, you can use code like given().when().get("https://example.com/api/endpoint").then();
  • Validating Responses: You can validate various aspects of the response, such as the status code (e.g., then().statusCode(200); to check if the response has a 200 status code), the headers, and the body. You can use methods to extract data from the response body and perform assertions on it. For instance, if the API returns JSON data, you can use JsonPath expressions in Rest Assured to extract and validate specific fields in the JSON (a combined example follows this list).
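
Putting both steps together, a minimal sketch (the endpoint URL and JSON field are illustrative):

    import static io.restassured.RestAssured.given;
    import static org.hamcrest.Matchers.equalTo;

    public class ApiSmokeTest {
        public static void main(String[] args) {
            // Send a GET request, then assert on the status code and a JSON field.
            given()
            .when()
                .get("https://example.com/api/endpoint")
            .then()
                .statusCode(200)
                .body("status", equalTo("ok"));
        }
    }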

4. AUTOMATION TEST

  • BDD - Cucumber - annotations: Behavior-Driven Development (BDD) is an approach that focuses on defining the behavior of the system from the perspective of the stakeholders. Cucumber is a popular tool for implementing BDD in Java (and other languages). Annotations in Cucumber are used to mark different parts of the feature files and step definitions. For example, @Given, @When, @Then are commonly used annotations in step definitions. @Given is used to set up the preconditions, @When describes the action being performed, and @Then is used to define the expected outcome. Feature files written in Gherkin language (a simple syntax used by Cucumber) use these annotations to describe the behavior of the system in a human-readable format (a step-definition sketch follows this list).
  • Load Test with JMeter: Apache JMeter is a tool used for load testing web applications, web services, and other types of applications. It can simulate a large number of concurrent users sending requests to the application to measure its performance under load. You can configure JMeter to define the number of threads (simulating users), the ramp-up period (how quickly the users are added), and the duration of the test. It can generate detailed reports on metrics such as response times, throughput, and error rates, helping you identify bottlenecks in the application.
  • Performance tool JProfiler: JProfiler is a powerful Java profiling tool used for performance analysis. It can help you identify performance issues in your Java applications by analyzing memory usage, CPU utilization, and thread behavior. It allows you to take snapshots of the application’s state at different times, trace method calls, and find memory leaks. You can use JProfiler to optimize your code by identifying methods that consume a lot of resources and improving their performance.
  • AB Test: AB testing is a method of comparing two versions (A and B) of a web page, application feature, or marketing campaign to determine which one performs better. In AB testing, a random subset of users is shown version A, and another random subset is shown version B. Metrics such as click-through rates, conversion rates, or user engagement are then measured for each version. Based on the results, you can decide which version to implement permanently. AB testing is often used in web development and digital marketing to make data-driven decisions about changes to the product or service.
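
A minimal step-definition sketch for the Cucumber annotations above; the feature wording and calculator logic are illustrative.

    import static org.junit.jupiter.api.Assertions.assertEquals;

    import io.cucumber.java.en.Given;
    import io.cucumber.java.en.Then;
    import io.cucumber.java.en.When;

    public class CalculatorSteps {

        private int left;
        private int result;

        @Given("a starting value of {int}")
        public void aStartingValue(int value) {
            left = value;            // precondition
        }

        @When("I add {int}")
        public void iAdd(int value) {
            result = left + value;   // the action being performed
        }

        @Then("the result is {int}")
        public void theResultIs(int expected) {
            assertEquals(expected, result);  // the expected outcome
        }
    }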

Future & CompletableFuture

In Java, Future and CompletableFuture are both used to handle asynchronous operations. CompletableFuture, introduced in Java 8, builds on Future and is considerably more powerful. The sections below compare them from several angles, with examples and a summary table.

Future

  • The Future interface represents the result of an asynchronous computation. It is used in conjunction with ExecutorService when you submit a Callable task.
  • A Callable is similar to a Runnable, but it can return a result and throw an exception.
  • When you submit a Callable to an ExecutorService, it returns a Future object, which you can use to check if the computation is done, wait for the computation to complete, and retrieve the result of the computation.
  • Example of using Future with ExecutorService:

    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class FutureExample {
        public static void main(String[] args) {
            ExecutorService executorService = Executors.newFixedThreadPool(5);
            // Submits a Callable task
            Future<Integer> future = executorService.submit(new Callable<Integer>() {
                @Override
                public Integer call() throws Exception {
                    // Simulates some computation
                    Thread.sleep(2000);
                    return 42;
                }
            });
            try {
                // Waits for the task to complete and gets the result
                Integer result = future.get();
                System.out.println("Result: " + result);
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
            }
            executorService.shutdown();
        }
    }

    In the code above:

    • executorService.submit() is used to submit a Callable<Integer> task. The Callable task simulates some computation (in this case, it sleeps for 2 seconds and then returns the value 42).
    • future.get() blocks the calling thread until the computation is completed and returns the result. If the computation throws an exception, it will be wrapped in an ExecutionException.
    • InterruptedException is thrown if the waiting thread is interrupted while waiting for the result.

CompletableFuture

  • Enhanced Functionality:

    • CompletableFuture was introduced in Java 8. It implements Future and provides additional functionality for chaining asynchronous operations, combining multiple futures, and handling exceptions.
    • It allows you to perform actions upon completion, combine multiple futures, and transform results.

      import java.util.concurrent.CompletableFuture;
      import java.util.concurrent.ExecutionException;

      public class CompletableFutureExample {
          public static void main(String[] args) throws ExecutionException, InterruptedException {
              CompletableFuture<Integer> future = CompletableFuture.supplyAsync(() -> {
                  // Simulate a long-running task
                  try {
                      Thread.sleep(2000);
                  } catch (InterruptedException e) {
                      e.printStackTrace();
                  }
                  return 42;
              });

              // Do other work while the future is being computed
              System.out.println("Doing other work...");

              // Chain another action upon completion
              CompletableFuture<String> resultFuture = future.thenApply(result -> "Result: " + result);

              // Block until the final result is available
              String result = resultFuture.get();
              System.out.println(result);
              /* printed output:
              Doing other work...
              Result: 42
              */
          }
      }
    • Explanation:

      1. CompletableFuture.supplyAsync(() -> {... });: Creates a CompletableFuture that runs the given task asynchronously.
      2. future.thenApply(result -> "Result: " + result);: Chains another action to the CompletableFuture.
      3. resultFuture.get();: Blocks until the final result is available.

Comparison

1. Basic functionality

  • Future: Future represents the result of an asynchronous computation. It provides methods to check whether the computation is complete, to wait for completion, and to retrieve the result, but it offers little further control over, or composition of, asynchronous operations.
  • CompletableFuture: CompletableFuture covers everything Future does and additionally supports chained calls and composition of multiple asynchronous operations, making complex asynchronous tasks much easier to handle.

2. Creating asynchronous tasks

  • Future: A Future is usually obtained by submitting a task to a thread pool.
  • CompletableFuture: Provides several static factory methods, such as runAsync and supplyAsync.

3. Error handling

  • Future: Future has no built-in error-handling mechanism; exceptions must be caught manually.
  • CompletableFuture: Offers a dedicated exceptionally method for handling exceptions, and the handle method can process the normal result and the exception in one place.

4. Composing multiple asynchronous tasks

  • Future: Combining several Future tasks is cumbersome and requires manual management of threads and results.
  • CompletableFuture: Provides a rich set of composition methods, such as thenCompose and thenCombine (illustrated below).
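
A short sketch of thenCombine and exceptionally (the values are illustrative):

    import java.util.concurrent.CompletableFuture;

    public class CombineExample {
        public static void main(String[] args) {
            CompletableFuture<Integer> price = CompletableFuture.supplyAsync(() -> 100);
            CompletableFuture<Integer> tax = CompletableFuture.supplyAsync(() -> 8);

            CompletableFuture<Integer> total = price
                    .thenCombine(tax, Integer::sum)   // combine two independent futures
                    .exceptionally(e -> {             // fallback if either stage failed
                        System.err.println("failed: " + e.getMessage());
                        return 0;
                    });

            System.out.println(total.join());         // prints 108
        }
    }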

Comparison table

Aspect | Future | CompletableFuture
Basic functionality | Represents the result of an asynchronous computation; can check completion, wait, and fetch the result | Everything Future offers, plus chaining and composition of multiple asynchronous operations
Creating tasks | Usually created by submitting a task to a thread pool | Several static factory methods, e.g. runAsync, supplyAsync
Error handling | No built-in mechanism; exceptions must be caught manually | exceptionally and handle methods
Composing tasks | Cumbersome; manual thread and result management | Rich composition methods, e.g. thenCompose, thenCombine
Code readability | Verbose and harder to read | Fluent chaining; concise and readable

As the comparison and examples show, CompletableFuture is clearly more capable and easier to use than Future, and is better suited to complex asynchronous tasks.

Session7-SQL

  • What is data modeling? Why do we need it, and when would you need it?
  • What is a primary key? How is it different from a unique key?
  • What is normalization? Why do you need to normalize?
  • What does data redundancy mean? Can you give an example of each?
  • What is database integrity? Why do you need it?
  • What are joins? Explain the different types of joins in detail.
  • Explain indexes and why they are needed.
  • If we have one billion rows in a relational table and do not want to fetch them all at once, in what ways can we partition or page through the rows? (One approach is sketched below.)
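
As one answer to the last question, keyset (seek) pagination avoids scanning and skipping rows the way OFFSET does. A hedged JDBC sketch, assuming a MySQL-style employees table with an indexed id column:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class KeysetPagination {

        // Fetches one page of rows with id greater than lastSeenId;
        // pass 0 for the first page and the returned value for the next one.
        static long fetchPage(Connection connection, long lastSeenId, int pageSize) throws SQLException {
            String sql = "SELECT id, name FROM employees WHERE id > ? ORDER BY id LIMIT ?";
            try (PreparedStatement ps = connection.prepareStatement(sql)) {
                ps.setLong(1, lastSeenId);
                ps.setInt(2, pageSize);
                try (ResultSet rs = ps.executeQuery()) {
                    long lastId = lastSeenId;
                    while (rs.next()) {
                        lastId = rs.getLong("id");  // cursor for the next page
                        System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                    }
                    return lastId;
                }
            }
        }
    }

Other options (range partitioning, hash sharding, or LIMIT/OFFSET for small offsets) trade off differently; keyset pagination works well whenever a monotonically increasing key is available.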

Explain clustered and non-clustered indexes and their differences.

1. Clustered Index

Definition

A clustered index determines the physical order of data storage in a table. In other words, the rows of the table are physically arranged on disk in the order of the clustered index key. A table can have only one clustered index because there can be only one physical ordering of the data rows.

How it Works

  • Index Structure: The clustered index is often implemented as a B-tree data structure. The leaf nodes of the B-tree contain the actual data rows of the table, sorted according to the index key.
  • Data Retrieval: When you query data using the columns in the clustered index, the database can quickly locate the relevant rows because they are physically stored in the order of the index. For example, if you have a Customers table with a clustered index on the CustomerID column, and you query for a specific CustomerID, the database can efficiently navigate through the B-tree to find the corresponding row.

Example

-- Create a table with a clustered index on the ProductID column
CREATE TABLE Products (
    ProductID INT PRIMARY KEY CLUSTERED,
    ProductName VARCHAR(100),
    Price DECIMAL(10, 2)
);

In this example, the ProductID column is the clustered index. The rows in the Products table will be physically sorted by the ProductID value.

2. Non-Clustered Index

Definition

A non-clustered index is a separate structure from the actual data rows. It contains a copy of the indexed columns and a pointer to the location of the corresponding data row in the table. A table can have multiple non-clustered indexes.

How it Works

  • Index Structure: Similar to a clustered index, a non-clustered index is also typically implemented as a B-tree. However, the leaf nodes of the non-clustered index do not contain the actual data rows but rather pointers to the data rows in the table.
  • Data Retrieval: When you query data using the columns in a non-clustered index, the database first searches the non-clustered index to find the pointers to the relevant data rows. Then it uses these pointers to access the actual data rows in the table. This additional step of accessing the data rows can make non-clustered index lookups slightly slower than clustered index lookups for large datasets.

Example

-- Create a table
CREATE TABLE Orders (
    OrderID INT PRIMARY KEY,
    CustomerID INT,
    OrderDate DATE
);

-- Create a non-clustered index on the CustomerID column
CREATE NONCLUSTERED INDEX idx_CustomerID ON Orders (CustomerID);

In this example, idx_CustomerID is a non-clustered index on the CustomerID column. The index stores the CustomerID values and pointers to the corresponding rows in the Orders table.

3. Differences between Clustered and Non-Clustered Indexes

Physical Order of Data

  • Clustered Index: Determines the physical order of data storage in the table. The data rows are physically sorted according to the clustered index key.
  • Non-Clustered Index: Does not affect the physical order of data in the table. It is a separate structure that points to the data rows.

Number of Indexes per Table

  • Clustered Index: A table can have only one clustered index because there can be only one physical ordering of the data.
  • Non-Clustered Index: A table can have multiple non-clustered indexes. You can create non-clustered indexes on different columns or combinations of columns to improve query performance for various types of queries.

Storage Space

  • Clustered Index: Since it stores the actual data rows, it generally requires more storage space compared to a non-clustered index.
  • Non-Clustered Index: Stores only the indexed columns and pointers to the data rows, so it usually requires less storage space.

Query Performance

  • Clustered Index: Is very efficient for range queries (e.g., retrieving all rows where the index value falls within a certain range) because the data is physically sorted. It also has an advantage for queries that return a large number of rows.
  • Non-Clustered Index: Is useful for queries that filter on a small subset of data using the indexed columns. However, for queries that need to access a large number of rows, the additional step of following the pointers to the data rows can make it slower than using a clustered index.

Insert, Update, and Delete Operations

  • Clustered Index: Inserting, updating, or deleting rows can be more expensive because it may require re-arranging the physical order of the data on disk.
  • Non-Clustered Index: These operations are generally less expensive because they only involve updating the non-clustered index structure and the pointers, without affecting the physical order of the data.

What are normal forms

In the context of databases, “NF” usually stands for “Normal Form”. Normal forms are used in database design to organize data in a way that reduces data redundancy, improves data integrity, and makes the database more efficient and easier to manage. Some of the commonly known normal forms are:

  • First Normal Form (1NF): A relation is in 1NF if it has atomic values, meaning that each cell in the table contains only a single value and not a set of values. For example, a table where a column stores multiple phone numbers separated by commas would not be in 1NF.
  • Second Normal Form (2NF): A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key. This means that no non-key attribute should depend only on a part of the primary key in case of a composite primary key.
  • Third Normal Form (3NF): A relation is in 3NF if it is in 2NF and there is no transitive dependency of non-key attributes on the primary key. That is, a non-key attribute should not depend on another non-key attribute.
  • Boyce-Codd Normal Form (BCNF): BCNF is a stronger version of 3NF. A relation is in BCNF if for every functional dependency X → Y, X is a superkey. In other words, every determinant must be a candidate key.
  • Fourth Normal Form (4NF): A relation is in 4NF if it is in BCNF and there are no non-trivial multivalued dependencies.

1. Examples of Normalization

First Normal Form (1NF)

Original Table (Not in 1NF):
Suppose we have a Students table that stores information about students and their hobbies.

Student ID | Student Name | Hobbies
1 | John | Reading, Painting
2 | Jane | Singing, Dancing

The Hobbies column contains multiple values separated by commas, which violates 1NF.

Converted to 1NF:
We create a new table structure.
Students Table:

Student ID | Student Name
1 | John
2 | Jane

StudentHobbies Table:

Student ID | Hobby
1 | Reading
1 | Painting
2 | Singing
2 | Dancing

Second Normal Form (2NF)

Original Table (Violating 2NF):
Consider an Orders table with a composite primary key (Order ID, Product ID).

Order ID | Product ID | Product Name | Order Quantity
1 | 101 | Laptop | 2
1 | 102 | Mouse | 3
2 | 101 | Laptop | 1

The Product Name depends only on the Product ID (part of the composite primary key), violating 2NF.

Converted to 2NF:
Products Table:

Product ID | Product Name
101 | Laptop
102 | Mouse

OrderDetails Table:

Order ID | Product ID | Order Quantity
1 | 101 | 2
1 | 102 | 3
2 | 101 | 1

Third Normal Form (3NF)

Original Table (Violating 3NF):
Let’s have an Employees table.

Employee ID | Department ID | Department Name | Employee Salary
1 | 1 | IT | 5000
2 | 1 | IT | 6000
3 | 2 | HR | 4500

The Department Name is transitively dependent on the Employee ID through the Department ID, violating 3NF.

Converted to 3NF:
Departments Table:

Department ID | Department Name
1 | IT
2 | HR

Employees Table:

Employee ID | Department ID | Employee Salary
1 | 1 | 5000
2 | 1 | 6000
3 | 2 | 4500

2. Examples of Database Integrity

Entity Integrity

  • Explanation: Ensures that each row in a table is uniquely identifiable, usually through a primary key.
  • Example: In a Customers table, CustomerID is set as the primary key.
    CREATE TABLE Customers (
        CustomerID INT PRIMARY KEY,
        CustomerName VARCHAR(100),
        Email VARCHAR(100)
    );

If you try to insert a new row with an existing CustomerID, the database will reject the insert operation because it violates entity integrity.

Referential Integrity

  • Explanation: Maintains the consistency between related tables. A foreign key in one table must match a primary key value in another table.
  • Example: Consider an Orders table and a Customers table. The Orders table has a foreign key CustomerID that references the CustomerID in the Customers table.
    CREATE TABLE Customers (
        CustomerID INT PRIMARY KEY,
        CustomerName VARCHAR(100)
    );

    CREATE TABLE Orders (
        OrderID INT PRIMARY KEY,
        CustomerID INT,
        OrderDate DATE,
        FOREIGN KEY (CustomerID) REFERENCES Customers(CustomerID)
    );

If you try to insert an order with a CustomerID that does not exist in the Customers table, the database will not allow it due to referential integrity.

Domain Integrity

  • Explanation: Ensures that the data entered into a column falls within an acceptable range of values.
  • Example: In a Products table, the Price column should only accept positive values.
    CREATE TABLE Products (
        ProductID INT PRIMARY KEY,
        ProductName VARCHAR(100),
        Price DECIMAL(10, 2) CHECK (Price > 0)
    );

If you try to insert a product with a negative price, the database will reject the insert because it violates domain integrity.

How do you represent a multi-valued attribute in a database?

A multi-valued attribute is an attribute that can have multiple values for a single entity. Here are the common ways to represent multi-valued attributes in different types of databases:

Relational Databases

1. Using a Separate Table (Normalization Approach)

This is the most common and recommended method in relational databases as it adheres to the principles of database normalization.

Steps:

  • Identify the Entities and Attributes: Suppose you have an Employees entity with a multi-valued attribute Skills. An employee can have multiple skills, so the Skills attribute is multi-valued.
  • Create a New Table: Create a new table to store the multi-valued data. This table will have a foreign key that references the primary key of the main entity table.
  • Define the Schema:

    -- Create the Employees table
    CREATE TABLE Employees (
        employee_id INT PRIMARY KEY AUTO_INCREMENT,
        employee_name VARCHAR(100)
    );

    -- Create the Skills table
    CREATE TABLE Skills (
        skill_id INT PRIMARY KEY AUTO_INCREMENT,
        employee_id INT,
        skill_name VARCHAR(50),
        FOREIGN KEY (employee_id) REFERENCES Employees(employee_id)
    );
  • Insert and Query Data:

    -- Insert an employee
    INSERT INTO Employees (employee_name) VALUES ('John Doe');

    -- Insert skills for the employee
    INSERT INTO Skills (employee_id, skill_name) VALUES (1, 'Java');
    INSERT INTO Skills (employee_id, skill_name) VALUES (1, 'Python');

    -- Query all skills of an employee
    SELECT skill_name
    FROM Skills
    WHERE employee_id = 1;

2. Using Delimited Lists (Denormalization Approach)

In some cases, for simplicity or performance reasons, you may choose to use delimited lists to represent multi-valued attributes.

Steps:

  • Modify the Main Table: Instead of creating a separate table, you add a single column to the main table and store multiple values separated by a delimiter (e.g., a comma).

    -- Create the Employees table with a multi-valued attribute as a delimited list
    CREATE TABLE Employees (
        employee_id INT PRIMARY KEY AUTO_INCREMENT,
        employee_name VARCHAR(100),
        skills VARCHAR(200)
    );
  • Insert and Query Data:

    -- Insert an employee with skills
    INSERT INTO Employees (employee_name, skills) VALUES ('John Doe', 'Java,Python');

    -- Query employees with a specific skill
    SELECT *
    FROM Employees
    WHERE skills LIKE '%Java%';

However, this approach has several drawbacks. It violates the first normal form of database normalization, making it difficult to perform data manipulation and queries, and it can lead to data integrity issues.

Non-Relational Databases

1. Document Databases (e.g., MongoDB)

In document databases, multi-valued attributes can be easily represented as arrays within a document.

Steps:

  • Define the Document Structure: Create a collection and define the document structure to include an array for the multi-valued attribute.

    // Insert a document in the Employees collection
    db.employees.insertOne({
        employee_name: 'John Doe',
        skills: ['Java', 'Python']
    });
  • Query Data:

    // Query employees with a specific skill
    db.employees.find({ skills: 'Java' });

2. Graph Databases (e.g., Neo4j)

In graph databases, multi-valued attributes can be represented as relationships between nodes.

Steps:

  • Create Nodes and Relationships: Create nodes for the main entity and the values of the multi-valued attribute, and then create relationships between them.

    // Create an employee node
    CREATE (:Employee {name: 'John Doe'})
    // Create skill nodes
    CREATE (:Skill {name: 'Java'})
    CREATE (:Skill {name: 'Python'})
    // Create relationships between the employee and skills
    MATCH (e:Employee {name: 'John Doe'}), (s1:Skill {name: 'Java'}), (s2:Skill {name: 'Python'})
    CREATE (e)-[:HAS_SKILL]->(s1)
    CREATE (e)-[:HAS_SKILL]->(s2);
  • Query Data:

    // Query all skills of an employee
    MATCH (e:Employee {name: 'John Doe'})-[:HAS_SKILL]->(s:Skill)
    RETURN s.name;

How do you represent a many-to-many relationship in a database?

Here are the common ways to represent a many-to-many relationship in a database:

1. Using a Junction Table (Associative Table)

This is the most prevalent method in relational databases.

Step 1: Identify the related entities

Suppose you have two entities that have a many-to-many relationship. For example, in a school database, “Students” and “Courses”. A student can enroll in multiple courses, and a course can have multiple students.

Step 2: Create the junction table

The junction table contains at least two foreign keys, each referencing the primary key of one of the related tables.

  • Table creation in SQL (for MySQL):

    -- Create the Students table
    CREATE TABLE Students (
        student_id INT PRIMARY KEY AUTO_INCREMENT,
        student_name VARCHAR(100)
    );

    -- Create the Courses table
    CREATE TABLE Courses (
        course_id INT PRIMARY KEY AUTO_INCREMENT,
        course_name VARCHAR(100)
    );

    -- Create the junction table (Enrollments)
    CREATE TABLE Enrollments (
        student_id INT,
        course_id INT,
        PRIMARY KEY (student_id, course_id),
        FOREIGN KEY (student_id) REFERENCES Students(student_id),
        FOREIGN KEY (course_id) REFERENCES Courses(course_id)
    );

In this example, the Enrollments table is the junction table. The combination of student_id and course_id forms a composite primary key, which ensures that each enrollment (a relationship between a student and a course) is unique.

Step 3: Insert and query data

  • Inserting data:

    -- Insert a student
    INSERT INTO Students (student_name) VALUES ('John Doe');
    -- Insert a course
    INSERT INTO Courses (course_name) VALUES ('Mathematics');
    -- Record the enrollment
    INSERT INTO Enrollments (student_id, course_id) VALUES (1, 1);
  • Querying data: To find all courses a student is enrolled in, or all students enrolled in a course, you can use JOIN operations.

    -- Find all courses John Doe is enrolled in
    SELECT Courses.course_name
    FROM Students
    JOIN Enrollments ON Students.student_id = Enrollments.student_id
    JOIN Courses ON Enrollments.course_id = Courses.course_id
    WHERE Students.student_name = 'John Doe';

2. In Non-Relational Databases

Graph Databases

  • In graph databases like Neo4j, a many-to-many relationship is represented by nodes and relationships. Each entity is a node, and the relationship between them is an edge.
  • For example, you can create Student nodes and Course nodes. Then, you can create an ENROLLED_IN relationship between the Student and Course nodes.
    // Create a student node
    CREATE (:Student {name: 'John Doe'})
    // Create a course node
    CREATE (:Course {name: 'Mathematics'})
    // Create the enrollment relationship
    MATCH (s:Student {name: 'John Doe'}), (c:Course {name: 'Mathematics'})
    CREATE (s)-[:ENROLLED_IN]->(c);

Document Databases

  • In document databases such as MongoDB, you can use arrays to represent many-to-many relationships in a denormalized way. For example, in the students collection, each student document can have an array of course IDs, and in the courses collection, each course document can have an array of student IDs. However, this approach can lead to data duplication and potential consistency issues.
    // Insert a student document
    db.students.insertOne({
    name: 'John Doe',
    courses: [ObjectId("1234567890abcdef12345678"), ObjectId("234567890abcdef12345678")]
    });
    // Insert a course document
    db.courses.insertOne({
    name: 'Mathematics',
    students: [ObjectId("abcdef1234567890abcdef12"), ObjectId("bcdef1234567890abcdef12")]
    });

Session15-TRANSACTION JPA

  1. What is “Offline Transaction”?
  2. How do we usually perform Transaction Management in JDBC?
  3. What is a Database Transaction?
  4. What are the entity states defined in Hibernate / JPA?
  5. How can we transfer the entity between different states?
  6. What are the differences between save() and persist()?
  7. What are the differences between update(), merge() and saveOrUpdate()?
  8. How do you use Elasticsearch in your Java application?

1. What is “Offline Transaction”?

An offline transaction in the context of databases is a set of operations on data that occur without an immediate, real-time connection to the database server. The operations are carried out on a local copy of the data, and the changes are later synchronized with the main database.

Example:

  • Mobile Banking App: A user opens a mobile banking app on their smartphone while on an airplane (no internet connection). They can view their account balance and transaction history, which are stored locally. They can also initiate a new fund transfer. The app records this transfer request in a local database on the phone. Once the plane lands and the phone connects to the internet, the app synchronizes with the bank’s central database, uploading the new transfer request and downloading any new account updates.
  • Field Salesperson: A salesperson visits clients in an area with poor network coverage. Using a tablet, they access a local copy of the customer database. They add new customer details and record sales orders. Later, when they get back to an area with a network, the tablet syncs the new data with the company’s central database.
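
To make the synchronization idea concrete, here is a minimal, hypothetical Java sketch (the class and method names are invented for illustration): operations performed while offline are queued locally and replayed against the central database once connectivity returns.

import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch only; a real implementation would persist the queue locally.
public class OfflineTransactionBuffer {

    /** Stand-in for the remote/central database. */
    public interface RemoteDatabase {
        void execute(String operation);
    }

    private final Queue<String> pendingOperations = new ArrayDeque<>();

    /** Record an operation locally while there is no connection. */
    public void recordOffline(String operation) {
        pendingOperations.add(operation);
    }

    /** Replay the queued operations once connectivity is restored. */
    public void synchronize(RemoteDatabase central) {
        while (!pendingOperations.isEmpty()) {
            central.execute(pendingOperations.poll());
        }
    }
}

Real systems also need conflict handling (e.g., version numbers or timestamps) for records that changed both locally and centrally while offline.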

2. How do we usually perform Transaction Management in JDBC?

In JDBC (Java Database Connectivity), transaction management involves the following steps:

Step 1: Disable Auto-Commit Mode
By default, JDBC operates in auto-commit mode, where each SQL statement is treated as a separate transaction. To group multiple statements into a single transaction, we need to disable auto-commit.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class JDBCTransactionExample {
    public static void main(String[] args) {
        Connection connection = null;
        try {
            // Establish a connection
            connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "user", "password");
            // Disable auto-commit
            connection.setAutoCommit(false);

            Statement statement = connection.createStatement();
            // Execute SQL statements
            statement.executeUpdate("INSERT INTO employees (name, salary) VALUES ('John', 5000)");
            statement.executeUpdate("UPDATE departments SET budget = budget - 5000 WHERE dept_name = 'IT'");

            // Commit the transaction
            connection.commit();
        } catch (SQLException e) {
            try {
                if (connection != null) {
                    // Roll back the transaction in case of an error
                    connection.rollback();
                }
            } catch (SQLException ex) {
                ex.printStackTrace();
            }
            e.printStackTrace();
        } finally {
            try {
                if (connection != null) {
                    connection.close();
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }
}

Explanation:

  • connection.setAutoCommit(false): Disables auto-commit mode so that statements are grouped into a single transaction.
  • connection.commit(): Commits all the statements in the transaction if everything goes well.
  • connection.rollback(): Rolls back all the statements in the transaction if an error occurs.

3. What is a Database Transaction?

A database transaction is a sequence of one or more SQL statements that are treated as a single unit of work. It must satisfy the ACID properties:

  • Atomicity: Either all the statements in the transaction are executed successfully, or none of them are. For example, in a bank transfer, if you transfer money from one account to another, either both the debit from the source account and the credit to the destination account happen, or neither does.
  • Consistency: The transaction takes the database from one consistent state to another. For instance, if a rule in the database states that the total balance of all accounts should always be the same, a transaction should maintain this consistency.
  • Isolation: Transactions are isolated from each other. One transaction should not be affected by the intermediate states of other concurrent transactions. For example, if two users are trying to transfer money at the same time, their transactions should not interfere with each other.
  • Durability: Once a transaction is committed, its changes are permanent and will survive any subsequent system failures.

4. What are the entity states defined in Hibernate / JPA?

In Hibernate and JPA (Java Persistence API), entities can be in one of the following states:

  • Transient: An entity is transient when it is created using the new keyword and has not been associated with a persistence context. It has no corresponding row in the database.

    // Transient entity
    Employee employee = new Employee();
    employee.setName("Jane");
  • Persistent: A persistent entity is associated with a persistence context and has a corresponding row in the database. Any changes made to a persistent entity will be automatically synchronized with the database when the transaction is committed.

    EntityManager entityManager = entityManagerFactory.createEntityManager();
    entityManager.getTransaction().begin();
    Employee employee = entityManager.find(Employee.class, 1L);
    // Now the employee is in persistent state
  • Detached: A detached entity was once persistent but is no longer associated with a persistence context. It still has a corresponding row in the database, but changes made to it will not be automatically synchronized.

    entityManager.getTransaction().commit();
    entityManager.close();
    // Now the employee is in detached state
  • Removed: An entity is in the removed state when it has been marked for deletion from the database. Once the transaction is committed, the corresponding row in the database will be deleted.

    entityManager.getTransaction().begin();
    Employee employee = entityManager.find(Employee.class, 1L);
    entityManager.remove(employee);
    // Now the employee is in removed state

5. How can we transfer the entity between different states?

  • Transient to Persistent: Use methods like persist() or save() in Hibernate. In JPA, you can use EntityManager.persist().

    EntityManager entityManager = entityManagerFactory.createEntityManager();
    entityManager.getTransaction().begin();
    Employee employee = new Employee();
    employee.setName("Tom");
    entityManager.persist(employee);
    // Now the employee is in persistent state
  • Persistent to Detached: Closing the EntityManager or clearing the persistence context will make a persistent entity detached.

    entityManager.getTransaction().commit();
    entityManager.close();
    // The previously persistent entity is now detached
  • Detached to Persistent: Use the merge() method in JPA.

    EntityManager newEntityManager = entityManagerFactory.createEntityManager();
    newEntityManager.getTransaction().begin();
    Employee detachedEmployee = getDetachedEmployee();
    Employee persistentEmployee = newEntityManager.merge(detachedEmployee);
    // Now the entity is back in persistent state
  • Persistent/Detached to Removed: Use the remove() method in JPA.

    entityManager.getTransaction().begin();
    Employee employee = entityManager.find(Employee.class, 1L);
    entityManager.remove(employee);
    // Now the employee is in removed state

6. What are the differences between save() and persist()?

  • save() (Hibernate-specific):

    • Returns the generated identifier immediately. It can be used to insert a new entity into the database. If the entity is already persistent, it may throw an exception.
      Session session = sessionFactory.openSession();
      Transaction transaction = session.beginTransaction();
      Employee employee = new Employee();
      employee.setName("Alice");
      Serializable id = session.save(employee);
      transaction.commit();
      session.close();
  • persist() (JPA-standard):

    • Does not guarantee that the identifier will be assigned immediately. It is used to make a transient entity persistent. If the entity is already persistent, it will have no effect.
      EntityManager entityManager = entityManagerFactory.createEntityManager();
      entityManager.getTransaction().begin();
      Employee employee = new Employee();
      employee.setName("Bob");
      entityManager.persist(employee);
      entityManager.getTransaction().commit();
      entityManager.close();

7. What are the differences between update(), merge() and saveOrUpdate()?

  • update() (Hibernate-specific):

    • Used to make a detached entity persistent. If the entity is already persistent, it may throw an exception. It directly updates the database row corresponding to the entity.
      Session session = sessionFactory.openSession();
      Transaction transaction = session.beginTransaction();
      Employee detachedEmployee = getDetachedEmployee();
      session.update(detachedEmployee);
      transaction.commit();
      session.close();
  • merge() (JPA-standard):

    • Creates a copy of the detached entity, makes the copy persistent, and returns the persistent copy. The original detached entity remains detached. It can handle both transient and detached entities.
      EntityManager entityManager = entityManagerFactory.createEntityManager();
      entityManager.getTransaction().begin();
      Employee detachedEmployee = getDetachedEmployee();
      Employee mergedEmployee = entityManager.merge(detachedEmployee);
      entityManager.getTransaction().commit();
      entityManager.close();
  • saveOrUpdate() (Hibernate-specific):

    • Checks if the entity has an identifier. If it does not have an identifier, it acts like save(). If it has an identifier, it acts like update().
      Session session = sessionFactory.openSession();
      Transaction transaction = session.beginTransaction();
      Employee employee = new Employee();
      session.saveOrUpdate(employee);
      transaction.commit();
      session.close();

8. How do you use Elasticsearch in your Java application?

To use Elasticsearch in a Java application, you can follow these steps:

Step 1: Add Dependencies
If you are using Maven, add the following dependency to your pom.xml (note: the high-level REST client was deprecated in Elasticsearch 7.15 in favor of the newer Java API Client, but it remains common in existing projects):

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.17.3</version>
</dependency>

Step 2: Create a Client

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class ElasticsearchClientExample {
    public static void main(String[] args) {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));
        // Use the client for operations
        try {
            // Perform operations like indexing, searching, etc.
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                client.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}

Step 3: Index a Document

import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class ElasticsearchIndexExample {
    public static void main(String[] args) {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        Map<String, Object> jsonMap = new HashMap<>();
        jsonMap.put("title", "Elasticsearch Tutorial");
        jsonMap.put("content", "Learn how to use Elasticsearch in Java");

        IndexRequest request = new IndexRequest("my_index")
                .id("1")
                .source(jsonMap, XContentType.JSON);

        try {
            IndexResponse indexResponse = client.index(request, RequestOptions.DEFAULT);
            System.out.println(indexResponse);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

Step 4: Search for Documents

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.io.IOException;

public class ElasticsearchSearchExample {
    public static void main(String[] args) {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        SearchRequest searchRequest = new SearchRequest("my_index");
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(QueryBuilders.matchQuery("title", "Elasticsearch"));
        searchRequest.source(searchSourceBuilder);

        try {
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            System.out.println(searchResponse);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                client.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

Session17-UNIT TEST

  1. Explain and name some methods that you used in JUnit.
  2. Explain and name some annotations that you used in JUnit.
  3. What is Mockito and the usage of it?

1. Commonly-Used Methods in JUnit

assertEquals()

  • Explanation: This method is used to verify if two values are equal. It is very useful when you want to check if the result of a method call in your code under test matches the expected result.
  • Example:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    public class CalculatorTest {
        @Test
        public void testAddition() {
            Calculator calculator = new Calculator();
            int result = calculator.add(2, 3);
            assertEquals(5, result);
        }
    }

    class Calculator {
        public int add(int a, int b) {
            return a + b;
        }
    }

assertTrue() and assertFalse()

  • Explanation: assertTrue() is used to verify if a given condition is true, and assertFalse() is used to verify if a condition is false. These are handy when you want to check the truth value of a boolean expression returned by a method.
  • Example:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertTrue;
    import static org.junit.jupiter.api.Assertions.assertFalse;

    public class StringUtilTest {
        @Test
        public void testIsEmpty() {
            StringUtil stringUtil = new StringUtil();
            assertTrue(stringUtil.isEmpty(""));
            assertFalse(stringUtil.isEmpty("Hello"));
        }
    }

    class StringUtil {
        public boolean isEmpty(String str) {
            return str == null || str.length() == 0;
        }
    }

assertNull() and assertNotNull()

  • Explanation: assertNull() checks if an object reference is null, while assertNotNull() checks if an object reference is not null. They are useful when you need to ensure that a method returns or does not return a null value.
  • Example:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertNull;
    import static org.junit.jupiter.api.Assertions.assertNotNull;

    public class ObjectFactoryTest {
        @Test
        public void testCreateObject() {
            ObjectFactory objectFactory = new ObjectFactory();
            Object obj = objectFactory.createObject();
            assertNotNull(obj);
            Object nullObj = objectFactory.createNullObject();
            assertNull(nullObj);
        }
    }

    class ObjectFactory {
        public Object createObject() {
            return new Object();
        }

        public Object createNullObject() {
            return null;
        }
    }

2. Commonly-Used Annotations in JUnit

@Test

  • Explanation: This annotation is used to mark a method as a test method. JUnit will execute all methods annotated with @Test when running the test class.
  • Example:
    import org.junit.jupiter.api.Test;

    public class SimpleTest {
        @Test
        public void testSomething() {
            // Test logic here
        }
    }

@BeforeEach

  • Explanation: Methods annotated with @BeforeEach are executed before each test method. This is useful for setting up the test environment, such as initializing objects or variables that are needed for each test.
  • Example:
    import org.junit.jupiter.api.BeforeEach;
    import org.junit.jupiter.api.Test;

    public class UserServiceTest {
        private UserService userService;

        @BeforeEach
        public void setUp() {
            userService = new UserService();
        }

        @Test
        public void testCreateUser() {
            // Use userService for testing
        }
    }

    class UserService {
        // Class implementation
    }

@AfterEach

  • Explanation: Methods annotated with @AfterEach are executed after each test method. This is used for cleaning up resources, such as closing database connections or releasing memory.
  • Example:
    import org.junit.jupiter.api.AfterEach;
    import org.junit.jupiter.api.Test;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    public class FileServiceTest {
        private File tempFile;

        @Test
        public void testWriteToFile() throws IOException {
            tempFile = new File("temp.txt");
            FileWriter writer = new FileWriter(tempFile);
            writer.write("Test data");
            writer.close();
        }

        @AfterEach
        public void tearDown() {
            if (tempFile != null && tempFile.exists()) {
                tempFile.delete();
            }
        }
    }

@BeforeAll and @AfterAll

  • Explanation: @BeforeAll is used to annotate a static method that will be executed once before all the test methods in the class. @AfterAll is used to annotate a static method that will be executed once after all the test methods in the class. These are useful for performing expensive setup and cleanup operations, like starting and stopping a database server.
  • Example:
    import org.junit.jupiter.api.BeforeAll;
    import org.junit.jupiter.api.AfterAll;
    import org.junit.jupiter.api.Test;

    public class DatabaseServiceTest {
        private static DatabaseService databaseService;

        @BeforeAll
        public static void setUpAll() {
            databaseService = new DatabaseService();
            databaseService.startDatabase();
        }

        @Test
        public void testQueryDatabase() {
            // Test database query
        }

        @AfterAll
        public static void tearDownAll() {
            databaseService.stopDatabase();
        }
    }

    class DatabaseService {
        public void startDatabase() {
            // Start database logic
        }

        public void stopDatabase() {
            // Stop database logic
        }
    }

3. What is Mockito and Its Usage?

Definition

Mockito is a popular open-source testing framework for Java that allows you to create mock objects. Mock objects are simulated objects that mimic the behavior of real objects in a controlled way. They are used to isolate the code under test from its dependencies, making unit tests more reliable and faster.

Common Usages

Creating Mock Objects

  • You can use Mockito.mock() to create a mock object of a class or an interface.
    import org.junit.jupiter.api.Test;
    import static org.mockito.Mockito.mock;

    public class MockitoExample {
        @Test
        public void testMockCreation() {
            MyInterface myMock = mock(MyInterface.class);
            // Now myMock is a mock object of MyInterface
        }
    }

    interface MyInterface {
        void doSomething();
    }

Stubbing Methods

  • Stubbing means defining the behavior of a method on a mock object. You can use methods like when() and thenReturn() to stub methods.
    import org.junit.jupiter.api.Test;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.when;

    public class StubbingExample {
        @Test
        public void testStubbing() {
            MyService myService = mock(MyService.class);
            when(myService.getResult()).thenReturn(10);
            int result = myService.getResult();
            // result will be 10
        }
    }

    class MyService {
        public int getResult() {
            return 0;
        }
    }

Verifying Method Calls

  • You can use Mockito.verify() to check if a method on a mock object has been called with specific arguments.
    import org.junit.jupiter.api.Test;
    import static org.mockito.Mockito.mock;
    import static org.mockito.Mockito.verify;

    public class VerificationExample {
        @Test
        public void testVerification() {
            MyInterface myMock = mock(MyInterface.class);
            myMock.doSomething();
            verify(myMock).doSomething();
        }
    }

    interface MyInterface {
        void doSomething();
    }

@InjectMocks vs @Mock

@InjectMocks and @Mock are annotations commonly used when writing unit tests with the Mockito framework. They differ clearly in purpose and applicable scenarios:

Purpose
  • @Mock: Creates a mock object. A mock simulates the behavior of a real object: it takes over all method calls on that object, letting you freely define return values, thrown exceptions, and so on. In unit tests, when you want to isolate external dependencies, use @Mock to create mock versions of those dependencies, so you can focus on the logic of the object under test without being affected by them.
  • @InjectMocks: Creates a real object and attempts to inject the mock objects created with @Mock into it. When the object under test depends on other objects, use @InjectMocks to create the object under test and @Mock to create its dependencies; Mockito will then automatically inject those mocks into the target object.

Applicable Scenarios
  • @Mock: Suitable when an external dependency needs to be simulated. For example, when the target object depends on a database access object or a network service object and you do not want the test to actually hit the database or the network, use @Mock to simulate the behavior of those dependencies.
  • @InjectMocks: Suitable for testing the overall logic of a target object that depends on several other objects. Create the target with @InjectMocks and its dependencies with @Mock; this reproduces the target's runtime environment and lets you test its logic under different dependency behaviors.

Example Code

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.when;

// A dependency class
class Dependency {
    public String getData() {
        return "real data";
    }
}

// The class under test; it depends on Dependency
class MyService {
    private Dependency dependency;

    public MyService(Dependency dependency) {
        this.dependency = dependency;
    }

    public String processData() {
        return "Processed: " + dependency.getData();
    }
}

@ExtendWith(MockitoExtension.class)
public class MockitoAnnotationsExample {
    // Create a mock of the Dependency class
    @Mock
    private Dependency dependency;

    // Create a real MyService instance and inject the mocked Dependency into it
    @InjectMocks
    private MyService myService;

    @Test
    public void testProcessData() {
        // Stub the return value of the mocked dependency's getData method
        when(dependency.getData()).thenReturn("mocked data");

        // Call MyService's processData method
        String result = myService.processData();

        // Verify the result
        assertEquals("Processed: mocked data", result);
    }
}

Summary
  • @Mock is mainly used to create mock objects that simulate the behavior of external dependencies.
  • @InjectMocks is used to create a real object and inject the mocks into it, so that its overall logic can be tested.

Session19-MICROSERVICE

  1. In your own words, please describe some of the advantages and disadvantages of a Monolithic Application.
  2. In your own words, please describe some of the advantages and disadvantages of a Microservice Application.
  3. What is the purpose of using Netflix Eureka?
  4. How can microservices communicate with each other?
  5. What is the purpose of using Spring API Gateway?
  6. Explain cascading failure in microservice and how to prevent it.
  7. Explain CircuitBreaker and how it works in detail.

The following is an explanation of each question along with relevant examples:

Monolithic Application

  • Advantages
    • Simplicity: It’s a single unit, easy to develop, test and deploy. For example, a small blog website built with a monolithic architecture can be developed quickly as all the components are in one place.
    • Ease of Data Management: All components can access the same database easily, simplifying data consistency. In a monolithic e-commerce app, the product, order and user data can be managed centrally.
    • Good for Small Projects: Ideal for small-scale applications with low complexity and clear requirements. A simple internal management system for a small company may not need the complexity of a distributed architecture.
  • Disadvantages
    • Scalability Issues: As the application grows, it becomes hard to scale. If a monolithic social media app experiences a sudden traffic spike, scaling the entire application is more difficult and expensive than scaling individual components.
    • Slow Deployment: Any change requires redeploying the entire application. If you want to update a single feature in a monolithic banking app, the whole app needs to be deployed, causing potential downtime.
    • Technology Limitations: It’s hard to adopt new technologies or frameworks in a monolithic structure. For example, if you want to use a new data processing framework in a monolithic app that’s already using an old tech stack, it may require a major rewrite.

Microservice Application

  • Advantages
    • High Scalability: Each microservice can be scaled independently. In a large e-commerce platform like Amazon, the order processing, inventory management and user profile services can be scaled based on their specific load.
    • Technology Diversity: Different microservices can use different technologies based on their requirements. For example, the image processing microservice can use a different technology stack than the user authentication microservice.
    • Faster Deployment: Only the updated microservice needs to be deployed. If a new feature is added to the payment microservice of a fintech app, only that microservice is deployed, minimizing downtime.
  • Disadvantages
    • Complexity in Management: Managing multiple microservices, their communication and dependencies is complex. For example, coordinating data updates across multiple microservices in a healthcare application can be challenging.
    • Data Consistency: Ensuring data consistency across multiple microservices is difficult. In a microservices-based ride-hailing app, maintaining the consistency of driver and rider data across different services can be a problem.
    • Testing Complexity: Testing the entire system becomes more complex as it involves testing multiple microservices and their interactions. Testing a microservices-based logistics app requires testing each service and how they work together.

Netflix Eureka

  • Purpose: It’s a service discovery tool. It allows microservices in a distributed system to register and discover each other. For example, in a microservices architecture where there are multiple user service instances and order service instances, Eureka helps the order service find the available user service instances to communicate with.
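
As a rough sketch, a Eureka server in Spring Cloud can be enabled with a single annotation (this assumes the spring-cloud-starter-netflix-eureka-server dependency; client services use the corresponding client starter and register themselves on startup):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.netflix.eureka.server.EnableEurekaServer;

// Minimal Eureka server; clients point eureka.client.service-url.defaultZone
// at this server in their configuration and register themselves at startup.
@SpringBootApplication
@EnableEurekaServer
public class EurekaServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(EurekaServerApplication.class, args);
    }
}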

Microservices Communication

  • Methods:
    • RESTful API: Microservices can communicate via HTTP requests using RESTful APIs. For example, a product service can expose a REST API that a shopping cart service can call to get product details.
    • Message Queues: They can use message queues like RabbitMQ or Kafka. For instance, in an e-commerce system, when an order is placed, the order service can send a message to a message queue, which the inventory service listens to and updates the inventory accordingly.

Spring API Gateway

  • Purpose: It acts as a single entry point for all microservices. It provides features like request routing, authentication, rate limiting, etc. For example, in a microservices-based application, all external requests first come to the API gateway, which then routes the requests to the appropriate microservices. It can also apply authentication and authorization rules before allowing the request to reach the microservices.
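
As a sketch, Spring Cloud Gateway lets you express such routing rules in Java; the service names and paths below are invented for illustration:

import org.springframework.cloud.gateway.route.RouteLocator;
import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GatewayRoutes {

    // Route /users/** to the user service and /orders/** to the order service;
    // the "lb://" scheme resolves targets through the service registry (e.g., Eureka).
    @Bean
    public RouteLocator routes(RouteLocatorBuilder builder) {
        return builder.routes()
                .route("user-service", r -> r.path("/users/**").uri("lb://user-service"))
                .route("order-service", r -> r.path("/orders/**").uri("lb://order-service"))
                .build();
    }
}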

Cascading Failure in Microservice and Prevention

  • Explanation: In a microservices environment, if one microservice fails, it can cause other dependent microservices to fail, leading to a cascading effect. For example, if the user service in a social media app fails, the services that depend on it like the post service (which needs to get user information) and the comment service may also fail.
  • Prevention:
    • Circuit Breaker: Implementing circuit breakers can prevent cascading failures. If a microservice fails to respond after a certain number of attempts, the circuit breaker trips and stops sending requests to that service, preventing other services from waiting indefinitely and potentially failing.
    • Isolation: Using techniques like thread pools and resource isolation to ensure that the failure of one microservice doesn’t exhaust the resources of other services.

Circuit Breaker

  • Explanation: A circuit breaker is a design pattern used to prevent cascading failures in a microservices architecture. It monitors the health of a service and if the service fails to respond or returns errors frequently, the circuit breaker trips and stops sending requests to that service for a certain period.
  • How it Works (a code sketch follows this list):
    • Closed State: Initially, the circuit breaker is in the closed state and all requests are sent to the service as normal.
    • Open State: If the service fails a certain number of times within a given time period, the circuit breaker opens. In this state, all requests to the service are immediately failed without being sent to the actual service.
    • Half-Open State: After a certain period in the open state, the circuit breaker enters the half-open state. It allows a small number of requests to be sent to the service to check if it has recovered. If the requests succeed, the circuit breaker closes and normal operation resumes. If the requests fail, the circuit breaker returns to the open state.
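
As referenced above, here is a sketch of how these states might be configured with the Resilience4j library; the thresholds and the callUserService() method are illustrative assumptions, not part of the original text:

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;

import java.time.Duration;
import java.util.function.Supplier;

public class CircuitBreakerSketch {

    public static void main(String[] args) {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                .failureRateThreshold(50)                        // open when >= 50% of recent calls fail
                .slidingWindowSize(10)                           // "recent" means the last 10 calls
                .waitDurationInOpenState(Duration.ofSeconds(30)) // stay open 30s, then move to half-open
                .permittedNumberOfCallsInHalfOpenState(3)        // trial calls allowed while half-open
                .build();
        CircuitBreaker breaker = CircuitBreaker.of("userService", config);

        // Wrap the remote call; while the breaker is open, calls fail fast
        // instead of waiting on the unhealthy service.
        Supplier<String> protectedCall =
                CircuitBreaker.decorateSupplier(breaker, CircuitBreakerSketch::callUserService);
        System.out.println(protectedCall.get());
    }

    // Hypothetical remote call used for illustration.
    private static String callUserService() {
        return "user data";
    }
}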

Session23-DEVOPS

  1. Use your own words to explain Jenkins.
  2. Can you talk about CI/CD?
  3. Git commands you used in the project
  4. How do you release from the git repository
  5. How do you combine several commits together
  6. What is git cherry-pick
  7. Difference between git and svn
  8. Difference between git merge and rebase

1. Jenkins

Jenkins is an open-source automation server widely used in software development. Its main purpose is to automate various stages of the software development lifecycle, such as building, testing, and deploying applications.

How it works:
Jenkins has a web-based interface where you can create and configure jobs. A job in Jenkins can represent a specific task, like building a Java project or running a set of unit tests. You can define the steps of the job, including the commands to execute, the source code repositories to pull from, and the environment variables to use.

Example:
Suppose you are developing a Python web application. You can set up a Jenkins job to automatically pull the latest code from a Git repository, install the necessary Python dependencies, run unit tests, and then deploy the application to a staging server if the tests pass.

2. CI/CD

  • CI (Continuous Integration):
    • CI is a development practice where developers frequently integrate their code changes into a shared repository. Every time code is pushed to the repository, an automated build and test process is triggered. This helps to catch integration issues early in the development cycle.
    • Example: In a team of developers working on a mobile app, each developer may push their code changes to the main Git repository several times a day. A CI server (like Jenkins) then automatically builds the app from the latest code and runs unit and integration tests. If any tests fail, the developers are notified immediately.
  • CD (Continuous Delivery/Deployment):
    • Continuous Delivery is an extension of CI. It ensures that the software can be reliably released to production at any time. After the code passes the CI tests, it is automatically prepared for deployment, but the actual deployment to production may be a manual step.
    • Continuous Deployment takes it a step further and automatically deploys the software to production if it passes all the tests.
    • Example: For a web-based e-commerce application, with continuous delivery, once the code passes the CI tests, it is packaged and stored in a deployment artifact repository. A release manager can then decide when to deploy it to the production servers. In continuous deployment, the application is automatically deployed to production as soon as the tests pass.

3. Git commands used in a project

  • git clone: Used to create a local copy of a remote Git repository.
    • Example: git clone https://github.com/user/repo.git creates a local copy of the repo repository hosted on GitHub.
  • git add: Adds changes in the working directory to the staging area.
    • Example: git add src/main.py adds the changes made to the main.py file to the staging area.
  • git commit: Commits the changes from the staging area to the local repository with a descriptive message.
    • Example: git commit -m "Fixed a bug in the login function"
  • git push: Pushes the committed changes from the local repository to a remote repository.
    • Example: git push origin main pushes the changes from the local main branch to the main branch of the remote repository named origin.
  • git pull: Fetches and merges changes from a remote repository into the local repository.
    • Example: git pull origin main fetches the latest changes from the main branch of the origin remote repository and merges them into the local main branch.

4. Releasing from the Git repository

  • Create a Release Branch (Optional):
    • You can create a dedicated release branch from the main development branch (e.g., main or master). For example, git checkout -b release/v1.0 main creates a new release branch named release/v1.0 from the main branch.
  • Tag the Release:
    • Use the git tag command to mark a specific commit as a release. For example, git tag v1.0 tags the current commit as version v1.0. You can then push the tags to the remote repository using git push origin --tags.
  • Build and Deploy:
    • Use the tagged commit to build the application and deploy it to the appropriate environments (staging, production, etc.).

5. Combining several commits together

You can use the git rebase -i (interactive rebase) command to combine multiple commits.

  • Example: Suppose you have made 3 consecutive commits and want to combine them into one.
    • First, find the commit hash of the commit before the first commit you want to combine. Let’s say the commit hash is abc123.
    • Then run git rebase -i abc123. This will open an editor where you can see a list of commits.
    • Change the pick keyword to squash (or s) for the commits you want to combine with the previous one.
    • Save and close the editor. Git will then combine the commits, and you can provide a new commit message for the combined commit.
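
For illustration, the todo list that git rebase -i opens might look like the following (the hashes and messages here are made up); changing pick to squash on the second and third lines folds those commits into the first:

    pick 1a2b3c4 Add login form
    squash 5d6e7f8 Fix login form validation
    squash 9a8b7c6 Tweak login form styling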

6. Git cherry-pick

git cherry-pick is used to apply a specific commit from one branch to another.

  • Example: Suppose you have a feature branch with a commit that you want to apply to the main branch.
    • First, switch to the main branch: git checkout main.
    • Then use git cherry-pick <commit-hash>, where <commit-hash> is the hash of the commit on the feature branch that you want to apply. Git will then try to apply that commit to the main branch.

7. Difference between Git and SVN

  • Architecture:
    • Git: It is a distributed version control system. Every developer has a complete copy of the repository, including the entire commit history. This allows developers to work offline and perform most operations locally.
    • SVN: It is a centralized version control system. There is a single central repository, and developers need to connect to it to perform operations like committing changes or getting the latest code.
  • Branching and Merging:
    • Git: Branching and merging are very fast and easy. Creating a new branch is just a matter of creating a new pointer to a commit. Merging between branches is also efficient.
    • SVN: Branching and merging can be more complex and slower. It involves copying the entire directory structure in the repository to create a branch.
  • Data Integrity:
    • Git: It uses a hash-based system to ensure data integrity. Every commit, file, and directory has a unique hash, and any change to the data will result in a different hash.
    • SVN: While it also has some mechanisms for data integrity, it is not as robust as Git’s hash-based system.

8. Difference between Git merge and rebase

  • Merge:
    • A git merge combines the changes from two or more branches into one. It creates a new “merge commit” that has two parents, one from each branch being merged.
    • Example: If you have a feature branch and a main branch, and you want to integrate the changes from the feature branch into the main branch, you can run git checkout main followed by git merge feature. This will create a merge commit on the main branch.
    • The commit history after a merge shows a more complex, branching structure.
  • Rebase:
    • A git rebase moves or combines a sequence of commits to a new base commit. It takes the commits from one branch and replays them on top of another branch.
    • Example: If you have a feature branch and a main branch, and you want to update the feature branch with the latest changes from the main branch, you can run git checkout feature followed by git rebase main. This will take the commits from the feature branch and replay them on top of the latest commit on the main branch.
    • The commit history after a rebase is linear, which can make it easier to understand and follow. However, it can also be more complex to resolve conflicts during a rebase compared to a merge.

Session25-CLOUD

  1. AWS: difference between Parameter Store and Secrets Manager
  2. AWS: where to store certificate files
  Extra (topics we were not sure which session to put in):
  3. Use your own words to explain TDD and why use TDD.
  4. Please do some research on Redis and use your own words to explain what Redis is.
  5. Use your own words to explain what Swagger is.
  6. Please do some research on ELK and use your own words to explain what they are.
  7. Use your own words to explain Jira.
  8. What is RabbitMQ and what can it help us to achieve in a web application? What are the components of RabbitMQ?
  9. What are the different types of Exchange that exist in RabbitMQ?
  10. What is a Scheduler and what can it help us to achieve in a web application?

1. AWS: Difference between Parameter Store and Secrets Manager

Parameter Store

  • Explanation: AWS Systems Manager Parameter Store is a service that allows you to store configuration data such as database connection strings, API keys, and other parameters in a hierarchical structure. It’s designed for storing both plain-text and encrypted data. It helps in centralizing configuration management, making it easier to manage and update application settings across multiple environments.
  • Example: Suppose you have a microservices-based application deployed in multiple AWS regions. You can store the database connection strings for each region in the Parameter Store. For instance, a key-value pair like /myapp/production/db-connection-string with the actual connection string as the value. When your application starts, it can retrieve the appropriate connection string from the Parameter Store based on the environment.

Secrets Manager

  • Explanation: AWS Secrets Manager is focused on securely managing secrets such as passwords, access keys, and other sensitive information. It provides features like automatic rotation of secrets, auditing, and fine-grained access control. It’s designed to reduce the risk of exposing sensitive data and simplify the process of keeping secrets up to date.
  • Example: Consider an application that uses an Amazon RDS database. You can store the database password in Secrets Manager. The application can then retrieve the password securely when it needs to connect to the database. Additionally, you can set up automatic rotation of the password every 30 days, which helps in maintaining security.
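
A rough sketch of reading from both services with the AWS SDK for Java v2 (the parameter name and secret id are placeholders, and the ssm and secretsmanager SDK modules are assumed to be on the classpath):

import software.amazon.awssdk.services.secretsmanager.SecretsManagerClient;
import software.amazon.awssdk.services.secretsmanager.model.GetSecretValueRequest;
import software.amazon.awssdk.services.ssm.SsmClient;
import software.amazon.awssdk.services.ssm.model.GetParameterRequest;

public class AwsConfigExample {
    public static void main(String[] args) {
        // Parameter Store: fetch (and decrypt) a configuration value
        try (SsmClient ssm = SsmClient.create()) {
            String connectionString = ssm.getParameter(GetParameterRequest.builder()
                            .name("/myapp/production/db-connection-string") // placeholder name
                            .withDecryption(true)
                            .build())
                    .parameter().value();
            System.out.println(connectionString);
        }

        // Secrets Manager: fetch a secret such as a rotated database password
        try (SecretsManagerClient secrets = SecretsManagerClient.create()) {
            String dbPassword = secrets.getSecretValue(GetSecretValueRequest.builder()
                            .secretId("prod/db/password") // placeholder id
                            .build())
                    .secretString();
            System.out.println(dbPassword.length()); // avoid printing the secret itself
        }
    }
}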

2. AWS: Where to store certificate files

  • AWS Certificate Manager (ACM):
    • ACM is the recommended service for managing SSL/TLS certificates in AWS. It provides free SSL/TLS certificates for use with AWS services such as Elastic Load Balancing, Amazon CloudFront, and API Gateway. You can easily request, renew, and manage certificates through the ACM console or API.
    • Example: If you have a web application running behind an Elastic Load Balancer, you can request an SSL/TLS certificate from ACM and associate it with the load balancer. This enables secure communication between clients and your application.
  • AWS S3:
    • You can also store certificate files in an Amazon S3 bucket. However, you need to ensure proper security measures such as encryption and access control. This option is useful if you need to use the certificates with non-AWS services or if you want to have more control over the storage and management of the certificates.
    • Example: If you have an on-premises server that needs to use an SSL/TLS certificate stored in AWS, you can store the certificate in an S3 bucket and download it to the server when needed.

3. TDD (Test-Driven Development)

  • Explanation: TDD is a software development process where you write tests before writing the actual production code. The process typically follows a cycle of “Red-Green-Refactor”. First, you write a test that initially fails (Red). Then, you write the minimum amount of code to make the test pass (Green). Finally, you refactor the code to improve its design, readability, and maintainability without changing its behavior.
  • Why use TDD:
    • Early Bug Detection: By writing tests first, you can catch bugs early in the development process, reducing the cost of fixing them later.
    • Improved Design: TDD encourages writing modular and testable code, which leads to better software design.
    • Documentation: Tests serve as living documentation for the code, making it easier for other developers to understand how the code works.
  • Example: Suppose you are developing a simple calculator class with an add method. First, you write a test like this in Java using JUnit:
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertEquals;

    public class CalculatorTest {
        @Test
        public void testAdd() {
            Calculator calculator = new Calculator();
            int result = calculator.add(2, 3);
            assertEquals(5, result);
        }
    }

This test will initially fail because the Calculator class and the add method don’t exist yet. Then you write the minimum code to make the test pass:

public class Calculator {
    public int add(int a, int b) {
        return a + b;
    }
}

Finally, you can refactor the code if needed, for example, by adding more error handling or improving the code style.

4. Redis

  • Explanation: Redis is an open-source, in-memory data structure store that can be used as a database, cache, and message broker. It supports various data structures such as strings, hashes, lists, sets, and sorted sets. Redis is known for its high performance because it stores data in memory, which allows for very fast read and write operations. It also provides features like data persistence, replication, and clustering.
  • Example: In a web application, Redis can be used as a cache to store frequently accessed data. For example, if you have a news website, you can store the top-viewed articles in Redis. When a user requests the top-viewed articles page, the application first checks Redis. If the data is available in Redis, it can be returned immediately, reducing the load on the database.
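
The news-site caching idea can be sketched with the Jedis client (the key name, the 60-second TTL, and loadFromDatabase() are illustrative assumptions):

import redis.clients.jedis.Jedis;

public class TopArticlesCache {

    private static final String CACHE_KEY = "top-articles";

    // Cache-aside pattern: try Redis first, fall back to the database and populate the cache.
    public String getTopArticles(Jedis jedis) {
        String cached = jedis.get(CACHE_KEY);
        if (cached != null) {
            return cached; // cache hit: no database work needed
        }
        String fresh = loadFromDatabase(); // hypothetical slow database query
        jedis.setex(CACHE_KEY, 60, fresh); // cache the result for 60 seconds
        return fresh;
    }

    private String loadFromDatabase() {
        return "[...top articles...]";
    }
}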

5. Swagger

  • Explanation: Swagger is a set of open-source tools and frameworks for designing, building, documenting, and consuming RESTful APIs. It provides a way to describe the structure of an API using a JSON or YAML-based specification. Swagger tools can then generate interactive documentation, client libraries, and server stubs based on the API specification.
  • Example: Suppose you have a RESTful API for a book management system. You can use Swagger to define the API endpoints, request and response formats, and available operations. Swagger UI can then generate an interactive documentation page where developers can explore the API, test the endpoints, and see the expected input and output formats.
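
For instance, with the springdoc-openapi library a Spring controller can be annotated so that Swagger UI documents the endpoint; this is a minimal sketch, and the endpoint and summary text are invented:

import io.swagger.v3.oas.annotations.Operation;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class BookController {

    @Operation(summary = "Find a book by id",
               description = "Returns a single book from the catalog")
    @GetMapping("/books/{id}")
    public String getBook(@PathVariable long id) {
        return "book-" + id; // placeholder payload
    }
}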

6. ELK (Elasticsearch, Logstash, Kibana)

  • Explanation:
    • Elasticsearch: It is a distributed, open-source search and analytics engine. It stores data in a JSON-like format and allows for fast and flexible searching, filtering, and aggregation of data. It can handle large volumes of data and is often used for log analysis, full-text search, and real-time analytics.
    • Logstash: Logstash is a data processing pipeline that collects, filters, and transforms data from various sources (such as log files, system metrics, and application events) and sends it to a destination (such as Elasticsearch). It can perform tasks like parsing log data, enriching it with additional information, and cleaning it up.
    • Kibana: Kibana is a web-based visualization tool that works with Elasticsearch. It allows users to create visualizations, dashboards, and reports based on the data stored in Elasticsearch. It provides an intuitive interface for exploring and analyzing data.
  • Example: In a large-scale web application, Logstash can collect all the application logs from different servers. It can then parse the logs, extract relevant information such as request URLs, response times, and error messages. The processed data is sent to Elasticsearch for storage. Developers and administrators can then use Kibana to create dashboards showing the application’s performance metrics, error rates, and other important information.

7. Jira

  • Explanation: Jira is a popular project management and issue-tracking tool developed by Atlassian. It allows teams to plan, track, and manage projects, tasks, and bugs. Jira provides features such as customizable workflows, issue tracking, reporting, and integration with other tools. It can be used for software development projects, but also for other types of projects in different industries.
  • Example: In a software development team, Jira can be used to manage the development lifecycle of a project. Developers can create issues for new features, bugs, and tasks. The project manager can assign these issues to team members, set deadlines, and track the progress of each issue. Jira also provides reports on the project’s status, such as the number of open and closed issues, the time taken to resolve issues, and the overall project progress.

8. RabbitMQ

  • Explanation: RabbitMQ is an open-source message broker software that implements the Advanced Message Queuing Protocol (AMQP). It enables applications to communicate with each other by sending and receiving messages. It acts as an intermediary between producers (applications that send messages) and consumers (applications that receive messages).
  • What it can help achieve in a web application:
    • Decoupling: It allows different components of a web application to be decoupled. For example, in an e-commerce application, the order processing component can send messages to the inventory management component through RabbitMQ without having direct knowledge of the inventory system.
    • Asynchronous Processing: It enables asynchronous processing, which can improve the performance and scalability of the application. For instance, when a user submits a form, the application can send a message to RabbitMQ and continue processing other tasks without waiting for the form data to be fully processed.
  • Components of RabbitMQ (a producer sketch follows this list):
    • Producer: An application that sends messages to a RabbitMQ broker.
    • Consumer: An application that receives messages from a RabbitMQ broker.
    • Queue: A buffer that stores messages until they are consumed.
    • Exchange: Routes messages to one or more queues based on rules.
    • Broker: The RabbitMQ server that manages the queues, exchanges, and message routing.
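
As referenced above, a minimal producer sketch with the RabbitMQ Java client ties these components together (the queue name and message are placeholders; the amqp-client library is assumed):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.nio.charset.StandardCharsets;

public class OrderProducer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            // Declare a durable queue; the default ("") exchange routes by queue name.
            channel.queueDeclare("orders", true, false, false, null);
            channel.basicPublish("", "orders", null,
                    "order-1001".getBytes(StandardCharsets.UTF_8));
        }
    }
}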

9. Different types of Exchanges in RabbitMQ

  • Direct Exchange: Routes messages to queues based on the message’s routing key. Each queue is bound to the direct exchange with a specific routing key. When a message is sent to the direct exchange with a certain routing key, it is delivered to the queues that are bound with the same routing key (see the sketch after this list).
    • Example: In a logging application, different types of logs (e.g., error logs, warning logs) can be sent to different queues using a direct exchange.
  • Fanout Exchange: Routes messages to all the queues that are bound to it, regardless of the routing key. It is useful when you want to broadcast messages to multiple consumers.
    • Example: In a news application, when a new news article is published, a message can be sent to a fanout exchange, and all the queues (e.g., queues for different user groups) bound to the exchange will receive the message.
  • Topic Exchange: Routes messages to queues based on pattern matching of the routing key. Queues are bound to the topic exchange with a binding key that can contain wildcards (* for single-word matching and # for multi-word matching).
    • Example: In a financial application, messages about different stocks can be sent to a topic exchange. A queue can be bound to the exchange with a binding key like stocks.# to receive all messages related to stocks.
  • Headers Exchange: Routes messages based on the message headers rather than the routing key. Queues are bound to the headers exchange with a set of header values. When a message is sent with specific headers, it is delivered to the queues that match the header values.
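
As referenced in the direct-exchange item above, a sketch of declaring an exchange and binding a queue by routing key (the names are illustrative):

import com.rabbitmq.client.BuiltinExchangeType;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.nio.charset.StandardCharsets;

public class DirectExchangeExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            channel.exchangeDeclare("logs", BuiltinExchangeType.DIRECT, true);
            channel.queueDeclare("error-logs", true, false, false, null);
            // Only messages published with routing key "error" reach this queue.
            channel.queueBind("error-logs", "logs", "error");
            channel.basicPublish("logs", "error", null,
                    "disk full".getBytes(StandardCharsets.UTF_8));
        }
    }
}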

10. Scheduler

  • Explanation: A scheduler in a web application is a component that allows you to schedule tasks to run at specific times or intervals. It can be used to perform various tasks such as running batch jobs, sending periodic notifications, and refreshing caches.
  • What it can help achieve in a web application:
    • Automation: It automates repetitive tasks, reducing the need for manual intervention. For example, a scheduler can be used to automatically generate daily reports in a business application.
    • Resource Optimization: It can be used to schedule resource-intensive tasks during off-peak hours to optimize the use of system resources. For instance, a scheduler can be used to perform database backups at night when the application has low traffic.
  • Example: In a content management system, a scheduler can be used to publish new articles at a specific time. The administrator can set a publication time for an article, and the scheduler will ensure that the article is made available to the public at the specified time.
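
The article-publishing example might look like the following with Spring's scheduling support; the cron expression and the method body are illustrative assumptions:

import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Configuration
@EnableScheduling
class SchedulingConfig {
    // Enables detection of @Scheduled methods in the application context.
}

@Component
public class ArticlePublisher {

    // Spring cron format: second minute hour day-of-month month day-of-week.
    // Runs at the start of every minute and publishes any articles that are due.
    @Scheduled(cron = "0 * * * * *")
    public void publishDueArticles() {
        // Hypothetical: query articles with publishAt <= now and mark them public.
    }
}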