Posts

Git Flow Context

In the context of Git Flow, a popular branching model for managing Git branches, the roles of tags and branches are clearly defined:

Branches:

Main Branches:
- main (or master): The main branch where the source code of HEAD always reflects a production-ready state.
- develop: The branch where the latest development happens. This branch contains the complete history of the project, whereas the main branch contains an abridged version.

Supporting Branches:
- Feature Branches: Used to develop new features. Typically created off the develop branch and merged back into develop.
      git switch -c feature/<feature-name> develop
- Release Branches: Used to prepare a new production release. Created from develop and merged into both main and develop.
      git switch -c release/<version> develop
- Hotfix Branches: Used to quickly fix production issues. Created from main and merged back into both main and develop.
      git switch -c hotfix/<description> main

Tags:
- Release Tags : U...
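The supporting-branch rules above can be written down as a small lookup table. The following is only an illustrative sketch: the dictionary layout and the function name `start_branch_command` are assumptions made for this example and are not part of Git or of any Git Flow tooling.

```python
# Branching rules as stated in the excerpt above. The data layout and the
# function name are hypothetical, chosen only for illustration.
GIT_FLOW_RULES = {
    "feature": {"base": "develop", "merge_into": ["develop"]},
    "release": {"base": "develop", "merge_into": ["main", "develop"]},
    "hotfix": {"base": "main", "merge_into": ["main", "develop"]},
}


def start_branch_command(kind: str, name: str) -> str:
    """Return the `git switch` command that starts a supporting branch."""
    rule = GIT_FLOW_RULES[kind]  # raises KeyError for an unknown branch kind
    return f"git switch -c {kind}/{name} {rule['base']}"


if __name__ == "__main__":
    examples = [("feature", "login-form"), ("release", "1.4.0"), ("hotfix", "crash-on-start")]
    for kind, name in examples:
        targets = " and ".join(GIT_FLOW_RULES[kind]["merge_into"])
        print(f"{start_branch_command(kind, name)}  # later merged into {targets}")
```

The table makes the asymmetry easy to see: only hotfix branches start from main, while both release and hotfix branches are merged into two targets.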

Validating Native Query and Entity Graph

Using native queries in Spring JPA/Hibernate can be less enjoyable, but they are occasionally indispensable, especially for improving performance. Similarly, EntityGraphs become necessary in specific situations to mitigate N+1 query problems or to define the boundaries of the data retrieved from the database. Since it's hard to validate these scenarios without executing each query, inadequate testing can lead to significant issues. The same holds when refactoring numerous database tables and columns: a native query that misses an update may slip through testing and only fail in production. You may have encountered such scenarios as well. Therefore, we've invested effort in developing a solution that validates native queries against a schema. You might find it useful or have improvements to suggest. Here's what we've accomplished: Developed a component that conducts validation by executing each native query agains...

Shell Scripts

$? variable:
In a shell script, we can check the return status immediately after running a command to determine whether it succeeded, for example with echo $?. A return status of 0 indicates success; a non-zero status (typically 1) indicates failure.

/dev/null
/dev/null is a special device file in Unix-like operating systems (including Linux) that discards all data written to it. It essentially acts as a black hole for data. When data is written to /dev/null, it simply disappears and does not consume any storage space. Here are some common use cases for /dev/null:

Discarding Output: Redirecting output to /dev/null is a common way to discard unwanted output, such as diagnostic messages or verbose output, especially when running scripts or commands in the background where you don't need to see the output.

    command >/dev/null  # Redirects stdout to /dev/null
    command 2>/dev/null # Redirects stderr to /dev/null
    command ...
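For comparison, the same two ideas (checking an exit status and discarding output) can be expressed with Python's standard `subprocess` module. This is a minimal sketch; the helper name `run_quietly` is an assumption for this example, and the child commands simply re-invoke the Python interpreter so the snippet does not depend on any particular shell.

```python
import subprocess
import sys


def run_quietly(command: list[str]) -> int:
    """Run a command, discard stdout and stderr, and return its exit status."""
    completed = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,  # same effect as `>/dev/null`
        stderr=subprocess.DEVNULL,  # same effect as `2>/dev/null`
        check=False,                # do not raise on a non-zero status
    )
    return completed.returncode     # the value a shell would expose as `$?`


if __name__ == "__main__":
    ok = run_quietly([sys.executable, "-c", "print('hello')"])
    failed = run_quietly([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(ok, failed)
```

As in the shell, a status of 0 means success and any other value means failure. The second call shows that the non-zero value is not always 1.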

HTML to PDF

Converting HTML to PDF with code offers more control and flexibility compared to online tools. Here are some ways to achieve this:

1. Using Python Libraries: Python provides several libraries for HTML to PDF conversion. Here are two popular options:

WeasyPrint: This library renders HTML and CSS with its own layout engine (it does not depend on wkhtmltopdf) and offers fine-grained control over the conversion process. Here's an example using WeasyPrint:

    from weasyprint import HTML

    html_file = "my_report.html"  # Replace with your HTML file path
    pdf_file = "report.pdf"
    HTML(filename=html_file).write_pdf(pdf_file)
    print("Converted HTML to PDF successfully!")

PDFKit (using wkhtmltopdf): PDFKit is a thin wrapper around the wkhtmltopdf command-line tool, which must be installed separately. It offers a simpler API for basic conversions. Here's an example using PDFKit:

    import pdfkit

    url = "https://www.example.com" # Replace with a...

Recover lost files on Windows, free and effective

Windows File Recovery

If necessary, download and launch the app from Microsoft Store. Press the Windows key, enter Windows File Recovery in the search box, and then select Windows File Recovery. When you are prompted to allow the app to make changes to your device, select Yes. In the Command Prompt window, enter the command in the following format:

    winfr source-drive: destination-drive: [/mode] [/switches]

There are 2 basic modes you can use to recover files: Regular and Extensive.

Regular mode examples

Recover your Documents folder from your C: drive to the recovery folder on an E: drive. Don’t forget the backslash (\) at the end of the folder.

    winfr C: E: /regular /n \Users\<username>\Documents\

Recover PDF and Word files from your C: drive to the recovery folder on an E: drive.

    winfr C: E: /regular /n *.pdf /n *.docx

Extensive mode examples

    winfr E: C: /extensive /n *invoice*

Recover jpeg and png photos from your...

Archiving all messages on LinkedIn

Open https://www.linkedin.com/ in a web browser, open the developer tools, paste the script below into the Console tab, and press Enter. To stop the script, run clearInterval(timer) in the same console.

    timer = setInterval(() => {
      // select all messages
      const items = document.querySelectorAll('div.msg-selectable-entity__checkbox-container > input');
      for (let i = 0; i < items.length; i++) {
        items[i].click();
      }
      setTimeout(() => {
        // click archive button
        const buttons = document.querySelectorAll('div.display-flex.mvA > button[title="Archive"]');
        if (buttons.length == 1) {
          buttons[0].click();
        }
      }, 1000);
    }, 5000)

Credit to Gaidar Magdanurov

Git: create a patch of the last two commits

To merge two patch files into a single patch file or create a patch of the last two commits, you can use the `git diff` and `git apply` commands. Here are the steps for each scenario:

### Merge Two Patch Files into a Single Patch File:

1. Suppose you have two patch files named `patch1.patch` and `patch2.patch`.
2. To merge them into a single patch file, you can use the `cat` command:

   ```bash
   cat patch1.patch patch2.patch > combined.patch
   ```

   This command concatenates the contents of both patch files into a new file named `combined.patch`.

### Create a Patch of the Last Two Commits:

1. Generate a patch file for the last two commits using the `git diff` command:

   ```bash
   git diff HEAD~2..HEAD > last_two_commits.patch
   ```

   This command creates a patch file (`last_two_commits.patch`) that represents the changes introduced in the last two commits.
2. Apply the generated patch fil...

Find files that contain the word 'xxx' but do not contain the word 'yyy' in the same file

```bash
grep -rl 'xxx' /path/to/search/* | xargs grep -L 'yyy'
```

This command first searches for files containing 'xxx' using the `-r` (recursive) and `-l` (only file names) options. Then, it pipes the results to `xargs` to search for files that do not contain 'yyy' using the `-L` option. Make sure to replace `/path/to/search/*` with the actual path or file pattern you want to search.
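Where a shell pipeline is inconvenient (for example, when file names contain spaces or newlines), the same filter can be sketched in Python. This is an illustrative equivalent rather than a drop-in replacement: the function name `files_with_but_without` is an assumption, and it does plain substring matching, whereas `grep` interprets its pattern as a regular expression.

```python
from pathlib import Path


def files_with_but_without(root: str, wanted: str, unwanted: str) -> list[Path]:
    """Return files under `root` whose text contains `wanted` but not `unwanted`."""
    matches = []
    for path in sorted(Path(root).rglob("*")):  # recursive walk, like `grep -r`
        if not path.is_file():
            continue
        # errors="ignore" drops undecodable bytes so binary files do not raise
        text = path.read_text(encoding="utf-8", errors="ignore")
        if wanted in text and unwanted not in text:
            matches.append(path)
    return matches


if __name__ == "__main__":
    for match in files_with_but_without(".", "xxx", "yyy"):
        print(match)
```

Each file is read fully into memory, which is fine for source trees but may be slow for very large files.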

Jackson vs Gson

Choosing between Jackson and Gson depends on your specific needs and priorities. Both are excellent libraries, but they excel in different areas:

Jackson:

Strengths:
- Performance: Generally outperforms Gson, especially for large and complex data sets and when using streaming APIs or annotations.
- Flexibility: Offers extensive annotation support for customization, including support for inheritance and advanced features like "mix-in" annotations.
- Advanced features: Provides a streaming API for incremental processing, tree model access, and support for data binding with other formats like XML.

Weaknesses:
- Steeper learning curve: Requires more knowledge of JSON processing mechanisms compared to Gson.
- Complexity: Can be more complex to work with for simple tasks due to its rich feature set.

Gson:

Strengths:
- Simplicity : Easier to learn and use, especially for basic JSON parsing and generati...

JavaHiddenGems

Johanjanssen JavaHiddenGems

Make sure to start the Docker-webserver-cache container before running the OWASP dependency check or the Old GroupIds Alerter.

Github Examples

- Apache PDFBox: Create and change PDF files or extract content from PDF files. https://pdfbox.apache.org/
- Apache POI: Create, change and read files based on the Office Open XML standards (OOXML) such as Word and Excel files. https://poi.apache.org/
- ArchUnit: Verify the Java code's architecture with unit tests. https://www.archunit.org/
- AssertJ: Test code with assertions. https://assertj.github.io/doc/
- AutoService: Generator for ServiceLoader service providers. https://github.com/google/auto
- AutoValue: Generate immutable value classes. https://github.com/google/auto
- Awaitility: Test asynchronous applications with a DSL. https://github.com/awaitility/awaitility
- Buildpacks: Create (Docker) images. https://buildpacks.io/
- ClassGraph: Classpath and module scanner for Java and other JVM languages. https://github.com/c...

Apache Spark main components

Apache Spark has several main components that work together to enable distributed data processing. Here are the key components of Apache Spark:

1. **Driver Program:**
   - The driver program is the main program that controls the execution of a Spark application. It defines the high-level control flow, creates SparkContext, and coordinates the distribution of tasks across the cluster.
2. **SparkContext:**
   - SparkContext is the entry point for any Spark functionality. It coordinates the execution of Spark jobs and manages the distribution of tasks across the worker nodes. The driver program communicates with SparkContext to execute operations on the Spark cluster.
3. **Cluster Manager:**
   - Spark supports various cluster managers for resource management, including Apache Mesos, Apache Hadoop YARN, and standalone mode. The cluster manager allocates resources and schedules tasks across worker nodes in the cluster.
4. **Executor:**
   - Exec...

Apache Storm vs Apache Spark

Apache Spark and Apache Storm are both distributed data processing frameworks, but they are designed for different use cases and have different characteristics. Here's a comparison between Apache Spark and Apache Storm:

1. **Use Cases:**
   - **Apache Spark:** Spark is a general-purpose, fast, and in-memory data processing engine that supports both batch and stream processing. It is suitable for a wide range of applications, including large-scale data processing, machine learning, graph processing, and interactive queries.
   - **Apache Storm:** Storm is specifically designed for real-time stream processing. It excels at processing data in motion, making it suitable for applications that require low-latency and real-time analytics. Typical use cases include fraud detection, monitoring, and alerting systems.
2. **Processing Model:**
   - **Apache Spark:** Spark provides a higher-level API for both batch and stream processing. It uses a fu...

What is Apache Spark

Apache Spark is an open-source distributed computing system that provides a fast and general-purpose cluster-computing framework for big data processing. It was developed to overcome the limitations of the MapReduce model and is designed to be faster, more flexible, and more accessible for a wide range of data processing tasks. Key features of Apache Spark include:

1. **Speed:**
   - Spark is known for its in-memory processing capabilities, which allow it to perform iterative algorithms and interactive data analysis much faster than traditional disk-based systems like Hadoop MapReduce. This is achieved by caching intermediate data in memory between stages of computation.
2. **Ease of Use:**
   - Spark provides high-level APIs in Java, Scala, Python, and R, making it accessible to a broad audience of developers and data scientists. It offers a more user-friendly programming model compared to the lower-level MapReduce paradigm.
3. **Versatility:**
   - ...

Apache Flink main components

Apache Flink is a powerful distributed data processing framework with a variety of components that work together to process and analyze large-scale data. Here are the main components of Apache Flink:

1. **JobManager:**
   - The JobManager is the master daemon in a Flink cluster. It is responsible for accepting job submissions, coordinating and scheduling tasks across the TaskManagers, and managing the overall execution of Flink jobs.
2. **TaskManager:**
   - TaskManagers are worker nodes in the Flink cluster. They are responsible for executing tasks, which are the individual units of work in a Flink job. TaskManagers are assigned tasks by the JobManager and run them concurrently to achieve parallel processing.
3. **Job:**
   - A job in Flink represents the entire data processing application. It consists of a directed acyclic graph (DAG) of operators and defines the flow of data from sources (such as Kafka or HDFS) through various transformations to si...

Apache Storm main components

Apache Storm has several main components that work together to enable distributed real-time stream processing. Here are the key components of Apache Storm:

1. **Nimbus:**
   - Nimbus is the master node in a Storm cluster. It is responsible for distributing code around the cluster, assigning tasks to worker nodes, and monitoring the overall health of the cluster. Nimbus also manages the assignment of spouts and bolts in the topology.
2. **Supervisor:**
   - Supervisors run on worker nodes in the Storm cluster. They are responsible for starting and stopping worker processes based on the assignments received from Nimbus. Supervisors monitor the health and resource usage of worker processes and report back to Nimbus.
3. **Worker:**
   - A worker is a process running on a worker node that executes a subset of a topology. Each worker runs one or more executor threads, and each thread can run one or more tasks. Tasks correspond to individu...

What is Apache Flink

Apache Flink is an open-source stream processing and batch processing framework for big data processing and analytics. It is designed to efficiently process large volumes of data in real-time and batch processing modes, making it suitable for a wide range of data processing applications. Flink provides a unified runtime for both batch and stream processing, enabling developers to build complex data processing applications with ease. Key features of Apache Flink include:

1. **Unified Processing Model:**
   - Flink offers a unified processing model for both batch and stream processing. This allows developers to use the same API and programming model for both types of data processing, simplifying the development and maintenance of applications.
2. **Event Time Processing:**
   - Flink has built-in support for event time processing, allowing developers to handle and analyze data with respect to the timestamps assigned to events. This is crucial for handling out-of-...

Apache Storm vs Apache Spark

Apache Storm and Apache Spark are both distributed data processing frameworks, but they are designed for different use cases and have different characteristics. Here's a comparison between Apache Storm and Apache Spark:

1. **Use Cases:**
   - **Apache Storm:** Storm is specifically designed for real-time stream processing. It excels at processing data in motion, making it suitable for applications that require low-latency and real-time analytics. Typical use cases include fraud detection, monitoring, and alerting systems.
   - **Apache Spark:** Spark is a general-purpose data processing framework that supports both batch and stream processing. While it has a streaming module called Spark Streaming, it is not as optimized for low-latency processing as Storm. Spark is often used for large-scale batch processing, machine learning, graph processing, and interactive queries.
2. **Programming Model:**
   - **Apache Storm:** Storm provides a lo...

Apache Storm vs Apache Flink

Apache Storm and Apache Flink are both distributed stream processing frameworks, but they have some key differences in terms of architecture, programming models, and features. Here's a comparison between Apache Storm and Apache Flink:

1. **Programming Model:**
   - **Apache Storm:** Storm provides a low-level, event-driven programming model using spouts and bolts. Spouts are sources of data, and bolts are the processing units that apply transformations or analyses to the data. It is designed for building complex, directed acyclic graphs (DAGs) of processing stages.
   - **Apache Flink:** Flink offers a more high-level and expressive API for stream processing. Flink's API includes a functional programming style using operations like map, flatMap, filter, and windowing operations, making it easier to express complex data transformations.
2. **Event Time Processing:**
   - **Apache Storm:** Initially, Storm had challenges in handling event ...

Alternative of Apache Storm

There are several alternatives to Apache Storm for real-time stream processing, each with its own strengths and use cases. Here are some notable alternatives:

1. **Apache Flink:**
   - Apache Flink is a powerful open-source stream processing framework that supports both batch and stream processing. It provides event time processing, exactly-once semantics, and a rich set of APIs for building complex data processing applications.
2. **Apache Samza:**
   - Developed by LinkedIn and later open-sourced as part of the Apache Software Foundation, Apache Samza is a stream processing framework that focuses on simplicity and fault tolerance. It seamlessly integrates with Apache Kafka and is designed for high-throughput, low-latency processing.
3. **Spark Streaming (Structured Streaming):**
   - Apache Spark, a popular big data processing framework, includes a streaming module called Spark Streaming. In more recent versions, Structured Streaming has been introd...

Apache Storm vs Apache Kafka

Apache Storm and Apache Kafka serve different purposes in the context of real-time data processing.

**Apache Storm:**

1. **Processing Engine:** Storm is a distributed real-time stream processing engine. It is designed for processing and analyzing data in motion, as it flows through the system.
2. **Data Transformation:** Storm allows you to define complex data processing topologies using spouts and bolts. Spouts are sources of data, and bolts are the processing units that apply transformations or analyses to the data.
3. **Low-Latency Processing:** Storm is optimized for low-latency processing, making it suitable for use cases where real-time or near-real-time processing of streaming data is essential.
4. **Stateful Processing:** Storm supports stateful processing, allowing components in the topology to maintain state information across processing instances.

**Apache Kafka:**

1. **Distributed Streaming Platform:** Kafka, on the other hand, is a distributed streaming p...