Promtail Explained: Turning Logs into Metrics for Prometheus and Loki

Promtail is the solution when the metrics you need are only present in the log traces of the software you monitor and you still want to provide a consistent monitoring platform.

It is commonly understood that three pillars of the observability world help us get a complete view of the status of our platforms and systems: logs, traces, and metrics.

To provide a summary of the differences between each of them:

  • Metrics are numeric indicators of the state of the different components, from both a technical and a business point of view: things like CPU consumption, number of requests, memory, or disk usage.
  • Logs are the messages that each piece of software in our platform emits so we can understand its current behavior and detect unexpected situations.
  • Traces are the data about the end-to-end flow of a request across the platform: the services and systems that took part in that flow, plus data related to that specific request.

We have solutions that claim to address all of them, mainly in the enterprise software space, with Dynatrace, AppDynamics, and similar offerings. On the other hand, we can pick a specific solution for each pillar that we can easily integrate with the others, an approach we have discussed at length in previous articles.

But some software doesn't work following this path, because we live in the most heterogeneous era and we all embrace, at some level, a polyglot approach on new platforms. In some cases, software uses log traces to carry data that really belongs to metrics or other signals, and this is when we need to rely on pieces of software that help us “fix” that situation. Promtail does specifically that.

Promtail is mainly a log forwarder, similar to others like Fluentd or Fluent Bit from the CNCF, or Logstash from the ELK stack. In this case, it is the solution from Grafana Labs and, as you can imagine, part of the Grafana stack, with Loki acting as the mastermind, a combination we covered in an article that I recommend you take a look at if you haven't read it yet:

Promtail has two main ways of behaving as part of this architecture. The first one is very similar to others in this space, as we mentioned before: it ships the log traces from our containers to a central location, usually Loki although it can be a different one, and provides the usual options to manipulate and transform those traces, as we can do in other solutions. You can look at all the options in the link below, but as you can imagine, this includes transformation, filtering, parsing, and so on.

But what makes Promtail so different is one specific action you can perform: metrics. The metrics stage provides a way to create, based on the data we are reading from the logs, Prometheus metrics that a Prometheus server can scrape. That means you can use log traces that look something like this:

[2021-06-06 22:02.12] New request received for customer_id: 123
[2021-06-06 22:02.12] New request received for customer_id: 191
[2021-06-06 22:02.12] New request received for customer_id: 522

With this information, apart from sending the traces to the central location, we can create a metric called, for example, `total_request_count`, which is generated and exposed by the Promtail agent itself. This lets us take a metrics approach even for systems or components that don't provide a standard way to do that, such as a formal metrics API.

And the way to do this is well integrated into the configuration. It is done with an additional stage (this is what we call the actions we can perform in Promtail) named metrics.

The schema of the metrics stage is straightforward, and if you are familiar with Prometheus, you will see how directly a Prometheus metric definition maps to this snippet:

# A map where the key is the name of the metric and the value is a specific
# metric type.
metrics:
  [<string>: [ <metric_counter> | <metric_gauge> | <metric_histogram> ] ...]

So we start by defining the kind of metric we would like, and we have the usual ones: counter, gauge, or histogram. For each of them, there is a set of options to declare our metric, as you can see here for a counter metric:

# The metric type. Must be Counter.
type: Counter

# Describes the metric.
[description: <string>]

# Defines custom prefix name for the metric. If undefined, the default
# name "promtail_custom_" will be prefixed.
[prefix: <string>]

# Key from the extracted data map to use for the metric,
# defaulting to the metric's name if not present.
[source: <string>]

# Label values on metrics are dynamic which can cause exported metrics
# to go stale (for example when a stream stops receiving logs).
# To prevent unbounded growth of the /metrics endpoint any metrics which
# have not been updated within this time will be removed.
# Must be greater than or equal to '1s', if undefined default is '5m'
[max_idle_duration: <string>]

config:
  # If present and true all log lines will be counted without
  # attempting to match the source to the extract map.
  # It is an error to specify `match_all: true` and also specify a `value`
  [match_all: <bool>]

  # If present and true all log line bytes will be counted.
  # It is an error to specify `count_entry_bytes: true` without specifying `match_all: true`
  # It is an error to specify `count_entry_bytes: true` without specifying `action: add`
  [count_entry_bytes: <bool>]

  # Filters down source data and only changes the metric
  # if the targeted value exactly matches the provided string.
  # If not present, all data will match.
  [value: <string>]

  # Must be either "inc" or "add" (case insensitive). If
  # inc is chosen, the metric value will increase by 1 for each
  # log line received that passed the filter. If add is chosen,
  # the extracted value must be convertible to a positive float
  # and its value will be added to the metric.
  action: <string>
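
Putting the pieces together, here is a minimal sketch of a Promtail scrape config that turns the sample log lines above into the `total_request_count` counter. The job name, file path, and labels are illustrative assumptions, not values from the official docs:

scrape_configs:
  - job_name: app-logs
    static_configs:
      - targets: [localhost]
        labels:
          job: app
          __path__: /var/log/app/*.log
    pipeline_stages:
      # Extract customer_id from lines like "New request received for customer_id: 123"
      - regex:
          expression: 'New request received for customer_id: (?P<customer_id>\d+)'
      # Increment the counter for every line where customer_id was extracted
      - metrics:
          total_request_count:
            type: Counter
            description: "Total number of requests received"
            source: customer_id
            config:
              action: inc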

And with that, you will have your metric created and exposed, just waiting for a Prometheus server to scrape it. If you would like to see all the options available, everything is covered in the Grafana Labs documentation, which you can check at the link:

I hope you find this interesting and a useful way to keep all your observability information managed correctly using the right solution, and to handle those pieces of software that don't follow your paradigm.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Maid: The Ultimate Open-Source Automated File Organizer for Hackers

Maid provides the best of both worlds: the ease of use and the flexibility that come with a code-based interface.

One of the main problems I have with my daily computer use is a lack of organizational discipline. For me, it is quite hard to keep all the different files organized in the right place all the time, and I finally decided to stop fighting against it.

This is something that has happened to me since I started using computers, but it got worse when I began working in the industry more than ten years ago.

I have colleagues with a solid organization process, from the mail they receive (using the well-known Lotus or Outlook folders) to the documents they receive or produce for the different topics and accounts, using a specific folder organization.

For me, everything is dropped into some folder such as Downloads, Desktop, or similar, and I hope to find everything around there. For email, I have solved this problem because Gmail's search capabilities are so powerful that they can recover any email in seconds, but finding and arranging my files is much more complex.

Because of that, I have been looking for a tool to do all this work for me, because I know I cannot do it. Well, I can try... I can do it for a day or maybe two, but this is not a habit I can hold for a long period of time.

So, in this search for options, I have tried a lot of things because, in that time, I also changed operating systems several times, so I managed to try Hazel, File Juggler, Drop, and so on. But none of them provided the flexibility and the simplicity I needed at the same time. And over and over, I come back to my old friend Maid.

Maid is a project created by Ben Oakes, who defines it in his own words as "a Hazel for hackers", and that's true. It provides the same capabilities as Hazel but in a way that is much more flexible for people with a programming background.

Sample rule from maid to move PDF files from Downloads to the Books folder

Based on Ruby, in the end your duty is as easy as defining the rules you want to apply using this language. But to make it easier, and because you don't need to be a Ruby expert to use the tool, Ben has already developed a lot of helper functions to simplify the creation of these rules.

Helper functions like weeks.accessed, downloaded_from, or image_px, together with common actions like move, trash, or copy, as you can see in the sketch below.
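
As a taste of what this looks like, here is a minimal sketch of a rules file based on the helpers shown in the project's README; the folder names are just examples:

# ~/.maid/rules.rb
Maid.rules do
  rule 'Move PDFs from Downloads to the Books folder' do
    dir('~/Downloads/*.pdf').each do |path|
      move(path, '~/Books/')
    end
  end

  rule 'Trash downloads not accessed in a week' do
    dir('~/Downloads/*').each do |path|
      trash(path) if 1.week.since?(accessed_at(path))
    end
  end
end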

So, in the end, it is as if you code your rules using a very high-level language, like when you use a GUI in other programs such as Hazel. But at the same time, as this is code, you also have all its flexibility and power at your disposal when you need it.

Installing the tool is as easy as typing the following commands:

gem install maid
maid sample 

And after that, you will have a sample rules file in your .maid folder to help you create your initial set of rules. From that point on, you are only limited by your needs and your imagination to finally handle this file madness that, at least in my case, I have been living in for a long time.

#TIBFAQS Enabling Remote Debugging for TIBCO BusinessWorks Application on Kubernetes

Provide more agility to your troubleshooting efforts by debugging exactly where the error is happening using Remote Debugging techniques

Photo by Markus Winkler on Unsplash

The container revolution has provided a lot of benefits, as we have discussed in depth in other articles, and at the same time it has introduced some new challenges that we need to tackle.

This article is part of my comprehensive TIBCO Integration Platform Guide where you can find more patterns and best practices for TIBCO integration platforms.

All the agility that we now put in the hands of our developers also needs to be extended to maintenance work and bug fixing. We need to be agile there as well. We know the main complaints in this regard: "It works on my environment", "With the data set that I have, I couldn't see the issue", or "I couldn't reproduce the error" are sentences we hear over and over, and they delay the resolution of errors or improvements; even when the solution is simple, we struggle to get a real scenario to test.

And here is where Remote Debugging comes in. Remote debugging is, as its own name clearly states, the ability to debug something that is not local but remote. Since its conception, it has been a focus of mobile development, because no matter how good the simulator is, you will always need to test on a real device to make sure everything works properly.

This is the same concept applied to a container: we have a TIBCO BusinessWorks application running on Kubernetes, and we want to debug it as if it were running locally. To be able to do that, we need to follow these steps:

Enabling the Remote Debugging in the pod

The first step is to enable the remote debug option in the application. To do that, we use the internal API that BusinessWorks provides, executing the following from inside the container:

curl -XPOST "http://localhost:8090/bw/bwengine.json/debug/?interface=0.0.0.0&port=5554&engineName=Main"

In case we do not have any tool like curl or wget to hit a URL inside the container, we can always use the port-forward strategy to make port 8090 of the pod accessible and enable the debug port, using a command similar to the one below:

kubectl port-forward hello-world-test-78b6f9b4b-25hss 8090:8090

And then we can hit it from our local machine to enable remote debugging.

Make the Debug Port accessible to the Studio

To do the remote debugging, we need to connect our local TIBCO Business Studio to the specific pod that is executing the workload, and for that we need access to the debug port. We have mainly two options, shown in the subsections below: exposing the port at the pod level, or port-forwarding.

Expose the port at the Pod Level

We need to have the debug port open in our pod. To do that, we define an additional port that is not in use by the application and is not the default administration port (8090). For example, in my case I will use 5554 as the debug port, so I define that extra port to be accessed.

Definition of the debug port as a Service
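
For reference, a minimal sketch of such a Service definition could look like the one below; the names and the selector label are hypothetical and need to match your actual deployment:

apiVersion: v1
kind: Service
metadata:
  name: hello-world-test-debug
spec:
  selector:
    app: hello-world-test   # must match the label on the pods you want to debug
  ports:
    - name: debug
      port: 5554
      targetPort: 5554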

Port-Forwarding Option

If we do not want to expose the debug port all the time (after all, it will not be used unless we are running a remote debug session), another option is to do a port-forward to the debug port on our local machine:

kubectl port-forward hello-world-test-78b6f9b4b-cctgh 5554:5554

Connection to the TIBCO Business Studio

Now that we have everything ready, we need to connect our local TIBCO Business Studio to the pod, and to do that, we follow these steps:

Go to Run → Debug Configurations and select the Remote BusinessWorks Application option.

Selection of the Remote BusinessWorks application option in the Debug Configuration

Now we need to provide the connection details. In this case, we will use localhost and port 5554, and click the Debug button.

Setting the connection properties for the Remote Debugging

From that moment, we have a connection established between both environments: the pod running on our Kubernetes cluster and our local TIBCO Business Studio. And as soon as we hit the container, we can see the execution in our local environment:

Remote Debugging execution from our TIBCO Business Studio instance

Summary

I hope you find this interesting, and if you are facing this issue right now, you now have the information you need not to be stopped by it. If you would like to submit your questions, feel free to use one of the following options:

  • Twitter: You can send me a mention at @alexandrev on Twitter or a DM or even just using the hashtag #TIBFAQS that I will monitor.
  • Email: You can send me an email to alexandre.vazquez at gmail.com with your question.
  • Instagram: You can send me a DM on Instagram at @alexandrev

Level-Up Your Deployment Strategy with Canarying in Kubernetes

Save time and money on your application platform by deploying applications differently in Kubernetes.

We come from a time when we deployed an application using a seemingly straightforward, straight-line process. The traditional way goes pretty much like this:
  • Wait until a weekend or some time when the load is low and the business can tolerate some service unavailability.
  • We schedule the change and warn all the teams involved to be ready at that time to manage the impact.
  • We deploy the new version, all the teams run the functional tests they need to ensure it is working fine, and we wait for the real load to arrive.
  • We monitor during the first hours to see if something goes wrong and, in case it does, we trigger a rollback process.
  • If everything goes fine, we wait until the next release in 3-4 months.
But this is not valid anymore. Business demands IT to be agile and change quickly, and we cannot afford that kind of resource effort every week or, even worse, every day. Do you think it is possible to gather all the teams every night to deploy the latest changes? It is not feasible at all. So technology has advanced to help us solve that problem better, and this is where canarying comes in.

Introducing Canary Deployments

Canary deployments (or just canarying, as you prefer) are not something new, and a lot of people have been talking about them for some time, but before, it was neither easy nor practical to implement. Basically, the idea is to deploy the new version into production while keeping the traffic pointing to the old version of the application, and then start shifting some of the traffic to the new version.
Graphical representation of a canary release in a Kubernetes environment
Based on that small subset of requests, you monitor how the new version performs at different levels: functional, performance, and so on. Once you feel comfortable with the performance it is providing, you shift all the traffic to the new version and deprecate the old one.
Removal of the old version after all traffic has been shifted to the newly deployed version
The benefits that come with this approach are huge:
  • You don't need a staging environment as big as before, because you can run some of the tests with real data in production without affecting your business or the availability of your services.
  • You can reduce time to market and increase the frequency of deployments because you can do it with less effort and fewer people involved.
  • Your deployment window is extended a lot, as you do not need to wait for a specific time slot, and because of that you can deploy new functionality more frequently.

Implementing Canary Deployment in Kubernetes

To implement canary deployments in Kubernetes, we need more flexibility in how traffic is routed among our internal components, which is one of the capabilities we get from using a Service Mesh. We already discussed the benefits of using a Service Mesh as part of your environment, but if you would like to take another look, please refer to this article:

We have several technology components that can provide those capabilities and let you create the traffic routes needed to implement this. To see how, you can take a look at the following article about one of the default options, Istio:

But being able to route the traffic is not enough to implement a complete canary deployment approach. We also need to monitor the new version and act based on those metrics, to avoid manual intervention. To do this, we need to include different tools that provide those capabilities: Prometheus is the de facto option to monitor workloads deployed on Kubernetes, and here you can get more info about how both projects play together. And to manage the overall process, you can use a Continuous Deployment tool to put some governance around it, with options like Spinnaker or one of the extensions for continuous integration tools like GitLab or GitHub:
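
To make the routing part concrete, here is a minimal sketch of an Istio VirtualService that keeps 90% of the traffic on the current version and sends 10% to the canary. It assumes a DestinationRule already defines the v1 and v2 subsets, and the service name is hypothetical:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app
  http:
    - route:
        - destination:
            host: my-app
            subset: v1   # current stable version
          weight: 90
        - destination:
            host: my-app
            subset: v2   # canary version
          weight: 10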

Summary

In this article, we covered how we can evolve a traditional deployment model to keep pace with the innovation that businesses require today, how canary deployment techniques can help us on that journey, and the technology components needed to set up this strategy in your own environment.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Best Kubernetes Management Tools: Dashboard, CLI, and GUI Options Compared

Maximize your productivity when working with Kubernetes environments, with a tool for each persona.

We all know that Kubernetes is the default environment for the new applications we are building. That Kubernetes platform can come in different shapes and forms, but one thing is clear: it is complex.

The reason behind this complexity is the goal of providing all that flexibility. But it is also true that the k8s project has never put much effort into providing a simple way to manage your clusters: kubectl is the point of access to send commands. That leaves the door open for the community to provide its own solutions, and those solutions are what we are going to discuss today.

Kubernetes Dashboard: The Default Option

Kubernetes Dashboard is the default option for most installations. It is a web-based interface that is part of the K8s project, but it is not deployed by default when you install the cluster.
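
If you want to try it, it can be deployed with a single kubectl command and then reached through kubectl proxy. The manifest URL below points at one of the v2 releases and may differ for the version you want to install:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
kubectl proxy
# The UI is then served at:
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/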

K9s: The CLI Option

K9s is one of the most common options for those who love a very powerful command-line interface with a lot of options at their disposal.

It mixes all the power of a command-line interface, with all the keyboard shortcuts at your disposal, with a fancy graphical view that gives you a quick overview of the status of your cluster at a glance.
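
Installation methods vary by platform; on macOS, for example, it is typically available through Homebrew:

brew install k9s
k9s   # launches the terminal UI against your current kube context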

Lens: The Graphical Option

Lens is a feature-rich GUI option that goes beyond just showing the status of the K8s cluster or allowing modifications to its components. With integration with other projects such as Helm, and support for CRDs, it provides a very pleasant cluster management experience, with multi-cluster support as well. To learn more about Lens, you can take a look at this article where we cover its main features:

Octant: The Web Option

Octant provides an improved experience compared to the default web option discussed in this article, the Kubernetes Dashboard. It is built for extension, with a plug-in system that allows you to extend or customize Octant's behavior to maximize your productivity managing K8s clusters. Including CRD support and graphical visualization of dependencies, it provides an awesome experience.

Summary

We have presented in this article different tools that will help you with the important task of managing and inspecting your Kubernetes cluster. Each of them has its own characteristics, and each focuses on a different way to present the information (CLI, GUI, and web), so you can always find the one that works best for your situation and preferences.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Event-Driven Autoscaling in Kubernetes with KEDA (Beyond CPU and Memory Metrics)

KEDA provides a rich environment to scale your applications beyond the traditional HPA approach based on CPU and memory.

Autoscaling is one of the great things about cloud-native environments and helps us optimize operations. Kubernetes provides several options to do that, one of them being the Horizontal Pod Autoscaler (HPA) approach.

HPA is the mechanism Kubernetes uses to detect whether any of the pods need to scale, and it is based on metrics such as CPU usage or memory.

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

Sometimes those metrics are not enough to decide if the number of replicas we have is sufficient. Other metrics can provide a better perspective, such as the number of requests or the number of pending events.

Kubernetes Event-Driven Autoscaling (KEDA)

Here is where KEDA comes to help. KEDA stands for Kubernetes Event-Driven Autoscaling, and it provides a more flexible approach to scaling our pods inside a Kubernetes cluster.

It is based on scalers that implement different sources to measure the number of requests or events we receive from messaging systems such as Apache Kafka, AWS Kinesis, or Azure Event Hubs, and from other systems such as InfluxDB or Prometheus.

KEDA works as it is shown in the picture below:

We have a ScaledObject that links our external event source (e.g., Apache Kafka, Prometheus...) with the Kubernetes Deployment we would like to scale, and we register it in the Kubernetes cluster.

KEDA monitors the external source and, based on the metrics gathered, tells the Horizontal Pod Autoscaler to scale the workload as defined.

Testing the Approach with a Use-Case

So, now that we know how it works, we will run some tests to see it live. We are going to show how we can quickly scale one of our applications using this technology, and to do that, the first thing we need is to define our scenario.

In our case, the scenario will be a simple cloud-native application developed with Flogo that exposes a REST service.

The first step is to deploy KEDA in our Kubernetes cluster, and there are several options to do that: Helm charts, the Operator, or YAML files. In this case, we are going to use the Helm charts approach.

So, we are going to type the following commands to add the helm repository and update the charts available, and then deploy KEDA as part of our cluster configuration:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda

After running these commands, KEDA is deployed in our K8s cluster, and typing `kubectl get all` will show a situation similar to this one:

pod/keda-operator-66db4bc7bb-nttpz 2/2 Running 1 10m
pod/keda-operator-metrics-apiserver-5945c57f94-dhxth 2/2 Running 1 10m

Now we are going to deploy our application. As already mentioned, we are going to use our Flogo application, and the flow will be as simple as this one:

Flogo application listening to the requests
  • The application exposes a REST service using the /hello as the resource.
  • Received requests are printed to the standard output, and a message is returned to the requester.

Once we have our application deployed on our Kubernetes cluster, we need to create a ScaledObject that is responsible for managing the scalability of that component:

ScaledObject configuration for the application

We use Prometheus as a trigger, and because of that, we need to configure where our Prometheus server is hosted and which query we would like to run to manage the scalability of our component.

In our sample, we will use flogo_flow_execution_count, the metric that counts the number of requests received by this component, and when its rate goes higher than 100, KEDA will launch a new replica.
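
As a reference, a ScaledObject along those lines could look like the following sketch. The deployment name, the Prometheus address, and the exact query are assumptions that depend on your environment:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: flogo-app-scaler
spec:
  scaleTargetRef:
    name: flogo-app                  # the Deployment to scale
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local:9090
        metricName: flogo_flow_execution_count
        query: sum(rate(flogo_flow_execution_count[2m]))
        threshold: "100"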

After hitting the service with a load test, we can see that as soon as the service reaches the threshold, a new replica is launched to start handling requests, as expected.

Autoscaling being done using Prometheus metrics

All of the code and resources used in this article are hosted in a GitHub repository.



Summary

This post has shown that we have many options when deciding the scalability strategy for our workloads. We can use standard metrics like CPU and memory, but if we need to go beyond that, we can use different external sources of information to trigger the autoscaling.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Event-Driven Architecture: Enhancing the Responsiveness of Your Enterprise To Succeed

Event-Driven Architecture provides more agility to meet the changes of an ever more demanding customer ecosystem.

The market is shifting at a speed that requires us to be ready to change very quickly. Customers are becoming more and more demanding, and we need to deliver what they expect; to do so, we need an architecture responsive enough to adapt at the required pace.

Event-Driven Architectures (usually just referred to as EDA) are architectures where events are the crucial element, and we design components ready to handle those events in the most efficient way: an architecture that is ready to react to what is happening around us instead of just setting a fixed path for our customers.

This approach provides a lot of benefits to enterprises because of these characteristics, but at the same time it requires a different mindset and a different set of components in place.

What is an Event?

Let's start at the beginning. An event is anything that can happen and that is important to you. If you think about a scenario where a user is just navigating through an e-commerce website, everything he does is an event. If he lands on the e-commerce site because he followed a referral link, that is an event.

Events not only happen in virtual life but in real life too. A person walking into a hotel lobby is an event; going to the reception desk to check in is another; walking to their room is another... everything is an event.

Events in isolation provide a small piece of information, but together they can provide a lot of valuable information about customers: their preferences, their expectations, and their needs. All of that helps us provide the most customized experience to each one of our customers.

EDA vs Traditional Architectures

Traditional architectures work in pull mode: a consumer sends a request to a service, that service calls other components to execute the logic, gets the answer, and replies back. Everything is pre-defined.

Events work differently because they work in push mode. Events are sent, and that's it: an event could trigger one action, many actions, or none. You have a series of components waiting and listening until the event, or the sequence of events, they need appears in front of them; when it does, each one triggers its logic and, as part of that execution, generates one or more events that can be consumed again.

Pull vs Push mode for communication

To build an Event-Driven Architecture, the first thing we need is event-driven components: software components that are activated by events and also generate events as part of their processing logic. At the same time, this sequence of events becomes the way to complete complex flows in a cooperative mode, without the need for a mastermind component that is aware of the whole flow from end to end.

You just have components that know that, when this happens, they need to do their part of the job, and other components will listen to the output of those components and be activated.

This approach is called choreography because it works the same way as a ballet company, where each dancer can be doing different moves, but each of them knows exactly what to do, and all together, in sync, they create the whole piece.

Layers of an Event-Driven Architecture

Now that we have software components activated by events, we need some structure around them in our architecture to cover all the needs of event management, so we need the following layers:

Layers of the Event-Driven Architecture
  • Event Ingestion: We need a series of components that help us receive events into our systems. As we explained, there are tons and tons of ways to send events, so it is important to offer flexibility and options in that process. Adapters and APIs are crucial here to make sure all the events can be gathered and become part of the system.
  • Event Distribution: We need an Event Bus that acts as our event ocean, where all the events flow so they can activate all the components listening for them.
  • Event Processing: We need a series of components that listen to all the events being sent and make them meaningful. These components should act as security guards: they filter out the events that are not important, they enrich the events they receive with context information from other systems or data sources, and they transform the format of some events to make them easy to understand for all the components waiting for them.
  • Event Action: We need a series of components listening for those events and ready to react to what they see on the Event Bus; as soon as they detect what they expect, they start executing their logic and send the output back to the bus to be used by somebody else.

Summary

Event-Driven Architecture can provide a much more agile and flexible ecosystem, where companies can address the current challenge of offering a compelling experience to users and customers, and at the same time give more agility to the technical teams, which can create components that work in collaboration but are loosely coupled, making both components and teams more autonomous.

Kubernetes Health Checks Explained: Simplify Cluster Diagnostics with KubeEye

KubeEye supports you in the task of ensuring that your cluster is performing well and that all your best practices are being followed.

Kubernetes has become the new normal for deploying our applications and other serverless options, so the administration of these clusters has become critical for most enterprises, and running a proper Kubernetes health check is becoming essential.

It is clear that this is not an easy task. As always, the flexibility and power that a technology provides to its users (in this case, the developers) comes with a trade-off in operational and management complexity, and this is no exception.

We have evolved, including managed options that simplify all the underlying setup and low-level management of the infrastructure behind it. However, many things still need to be done on the cluster administration side to have a happy experience in the journey of a Kubernetes administrator.

There are a lot of concepts to deal with: namespaces, resource limits, quotas, ingresses, services, routes, CRDs... Any help we can get is welcome. And with this purpose in mind, KubeEye was born.

KubeEye is an open-source project that helps identify issues in our Kubernetes clusters. In its creators' words:

KubeEye aims to find various problems on Kubernetes, such as application misconfiguration(using Polaris), cluster components unhealthy and node problems(using Node-Problem-Detector). Besides predefined rules, it also supports custom defined rules.

So we can think of it as a buddy that checks the environment to make sure everything is well configured and healthy. It also allows us to define custom rules to make sure that what the different dev teams are doing conforms to the predefined standards and best practices.

So let's see how we can use KubeEye to run a health check on our environment. The first thing we need to do is install it. At this moment, KubeEye only offers a release for Linux-based systems, so if you are using another system, like me, you need to follow a different approach and type the following commands:

git clone https://github.com/kubesphere/kubeeye.git
cd kubeeye
make install

After doing that, we end up with a new binary in our PATH named `ke`, and this is the only component needed to work with the app. The second step, needed to get more detail in the diagnostics, is to install the Node Problem Detector component.

This is an agent installed on each node of the cluster that helps surface issues regarding the behavior of the Kubernetes cluster to the upstream layers. This is an optional step, but it provides more meaningful data. To install it, we run the following command:

ke install npd

And now we are ready to start checking our environment, and the command is as easy as this one:

ke diag

This will produce an output similar to the one below, composed of two different tables. The first one focuses on pods and the issues and events raised as part of the platform's status, and the other focuses on the rest of the elements and kinds of objects in the Kubernetes cluster.

Output from the ke diag command

The table for the issues at the pod level has the following fields:

  • Namespace the pod belongs to.
  • Severity of the issue.
  • Pod Name responsible for the issue.
  • EventTime when the event was raised.
  • Reason for the issue.
  • Message with the detailed description of the issue.

The second table for the other objects has the following structure:

  • Namespace where the object with the detected issue is deployed.
  • Severity of the issue.
  • Name of the component.
  • Kind of the component.
  • Time when the issue was raised.
  • Message with the detailed description of the issue.

The command's output can also show other tables if issues are detected at the node level.


Summary

Today we covered a fascinating topic, Kubernetes administration, and introduced a new tool that helps with your daily tasks.

I truly hope this tool can be added to your toolbox and ease the path to a happy and healthy Kubernetes cluster administration!

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Improving Development Security with Open Source DevSecOps Tools (Syft & Grype)

Discover how Anchore can help you keep your software safe and secure without losing agility.

Development security is one of the big topics in today's development practice. All the improvements we got by following DevOps practices have also generated many issues and concerns from the security perspective.

The explosion of components that security teams need to deal with, container approaches, and polyglot environments gave us many benefits from the development and operational perspectives. Still, it made the security side of things more complex.

This is why there have been many movements toward the "shift left" approach, including security as part of the DevOps process and creating the new term DevSecOps, which is becoming the new normal.

So, today I would like to bring you a set of tools I have just discovered, created with the goal of making your life easier from the development security perspective, because developers also need to be part of this and not leave all the responsibility to a different team.

This set of tools is named Anchore Toolbox, and they are open source and free to use, as you can see on the official webpage (https://anchore.com/opensource/).

So, what can Anchore provide us? At the moment, we are talking about two different applications: Syft and Grype.

Syft

Syft is a CLI tool and Go library for generating a Software Bill of Materials (SBOM) from container images and filesystems. Installation is as easy as executing the following command:

curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin

And after doing that, we can type `syft` to see all the options at our disposal:

Syft help menu with all the options available

So, in our case, I will use it to generate a bill of materials from an existing Docker image, bitnami/kafka, to show how this works. I need to type the following command:

syft bitnami/kafka

After a few seconds for the image to be loaded and analyzed, I get as output the list of each and every package this image has installed, plus the version of each of them, as shown below. One great thing is that it shows not only the operating system packages, like what we have installed using apk or apt, but also other components such as Java libraries, so we can have a complete bill of materials for this container image.

 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged image [204 packages]
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-javadoc.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-javadoc.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/scala-java8-compat_2.12-0.9.1.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/scala-java8-compat_2.12-0.9.1.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-test-sources.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-test-sources.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/jackson-module-scala_2.12-2.10.5.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/jackson-module-scala_2.12-2.10.5.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka-streams-scala_2.12-2.7.0.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka-streams-scala_2.12-2.7.0.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-test.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-test.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/scala-collection-compat_2.12-2.2.0.jar'
[0019] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/scala-collection-compat_2.12-2.2.0.jar'
[0020] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0.jar'
[0020] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0.jar'
[0020] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-sources.jar'
[0020] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/kafka_2.12-2.7.0-sources.jar'
[0020] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/scala-logging_2.12-3.9.2.jar'
[0020] WARN unexpectedly empty matches for archive '/opt/bitnami/kafka/libs/scala-logging_2.12-3.9.2.jar'
NAME VERSION TYPE
 java-archive
acl 2.2.53-4 deb
activation 1.1.1 java-archive
adduser 3.118 deb
aopalliance-repackaged 2.6.1 java-archive
apt 1.8.2.2 deb
argparse4j 0.7.0 java-archive
audience-annotations 0.5.0 java-archive
base-files 10.3+deb10u8 deb
base-passwd 3.5.46 deb
bash 5.0-4 deb
bsdutils 1:2.33.1-0.1 deb
ca-certificates 20200601~deb10u2 deb
com.fasterxml.jackson.module.jackson.module.scala java-archive
commons-cli 1.4 java-archive
commons-lang3 3.8.1 java-archive
...

Grype

Grype is a vulnerability scanner for container images and filesystems. It is the next step, because it takes the image's components and checks whether there are any known vulnerabilities.

Installing this component is, again, as easy as typing the following command:

curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin

After doing that, we can type `grype` to get the help menu with all the options at our disposal:

Grype help menu with all the options available

Grype works in the following way. The first thing it does is load the vulnerability DB, which the different packages are checked against in search of any known vulnerability. After doing that, it follows the same pattern as Syft: it generates the bill of materials and checks each of the components against the vulnerability database. If there is a match, it provides the ID of the vulnerability, its severity, and, if it has been fixed in a later version, the version where this vulnerability has been fixed.

Here you can see the output for the same bitnami/kafka image, with all the vulnerabilities detected:

grype bitnami/kafka
 ✔ Vulnerability DB [updated]
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged image [204 packages]
 ✔ Scanned image [149 vulnerabilities]
[0018] ERROR matcher failed for pkg=Pkg(type=java-archive, name=, version=): matcher failed to fetch by CPE pkg='': product name is required
[0018] ERROR matcher failed for pkg=Pkg(type=java-archive, name=, version=): matcher failed to fetch by CPE pkg='': product name is required
[0018] ERROR matcher failed for pkg=Pkg(type=java-archive, name=, version=): matcher failed to fetch by CPE pkg='': product name is required
[0018] ERROR matcher failed for pkg=Pkg(type=java-archive, name=, version=): matcher failed to fetch by CPE pkg='': product name is required
[0018] ERROR matcher failed for pkg=Pkg(type=java-archive, name=, version=): matcher failed to fetch by CPE pkg='': product name is required
[0018] ERROR matcher failed for pkg=Pkg(type=java-archive, name=, version=): matcher failed to fetch by CPE pkg='': product name is required
NAME INSTALLED FIXED-IN VULNERABILITY SEVERITY
apt 1.8.2.2 CVE-2011-3374 Negligible
bash 5.0-4 CVE-2019-18276 Negligible
commons-lang3 3.8.1 CVE-2013-1907 Medium
commons-lang3 3.8.1 CVE-2013-1908 Medium
coreutils 8.30-3 CVE-2016-2781 Low
coreutils 8.30-3 CVE-2017-18018 Negligible
curl 7.64.0-4+deb10u1 CVE-2020-8169 Medium
..

Summary

These simple CLI tools help us a lot in the necessary journey of keeping our software current and free of known vulnerabilities, improving our development security. Also, as these are CLI apps that can also run in containers, it is effortless to include them as part of your CI/CD pipeline, so vulnerabilities can be checked in an automated way.
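
As a sketch of what such a pipeline step could look like (the image name is an example, and the flags are the ones documented by both tools):

# Generate an SBOM and keep it as a build artifact
syft my-registry/my-app:latest -o json > sbom.json
# Fail the build if any vulnerability of severity high or above is found
grype my-registry/my-app:latest --fail-on high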

They also provide plugins for the most-used CI/CD systems, such as Jenkins, CloudBees, CircleCI, GitHub Actions, Bitbucket, Azure DevOps, and so on.

How to Check TIBCO BusinessWorks Configuration at Runtime (OSGi lcfg Command)

Discover how the OSGi lcfg command can help you be sure which configuration is active at runtime.

Knowing the TIBCO BW configuration at runtime has become critical, as you always need to know if the latest changes have been applied, or you may just want to check the specific value of a module property as part of your development.

This article is part of my comprehensive TIBCO Integration Platform Guide where you can find more patterns and best practices for TIBCO integration platforms.

When we talk about applications deployed on the cloud, one of the key things is configuration management. Especially if we add things like Kubernetes, containers, and external configuration management systems into the mix, things get tricky.

The usual setup for configuration management in a Kubernetes environment is the use of ConfigMaps or Spring Cloud Config.

When you can upload the configuration in a step separate from deploying the application, you can end up in a situation where you are not sure what running configuration a BusinessWorks application actually has.

To check the TIBCO BW configuration, there is an easy way to know the current values exactly:

  • We just need to get inside the container to access the internal OSGi console, which allows us to execute administrative commands.
  • We have spoken about that API other times, but in case you would like to take a deeper look, you just need to check this link:
  • One of the commands is lcfg, which lets us know which configuration is being used by the running application:
curl "localhost:8090/bw/framework.json/osgi?command=lcfg"

With an output similar to this:

Sample output for the lcfg command of a running BusinessWorks container application

Summary

I hope you find this interesting, and if you are facing this issue right now, you now have the information you need not to be stopped by it. If you would like to submit your questions, feel free to use one of the following options:

  • Twitter: You can send me a mention at @alexandrev on Twitter or a DM or even just using the hashtag #TIBFAQs that I will monitor.
  • Email: You can send me an email to alexandre.vazquez at gmail.com with your question.
  • Instagram: You can send me a DM on Instagram at @alexandrev