Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense


Apache Kafka seems to be the default choice in today's architectures, but we should ask whether it is really the right one for our needs.

Nowadays, we're in a new age of Event-Driven Architecture, and this is not the first time we've lived it. Before microservices and cloud, EDA was the new normal in enterprise integration. Based on different standards, there were protocols like JMS or AMQP used in broker-based products like TIBCO EMS, ActiveMQ, or IBM WebSphere MQ, so this approach is not something new.

With the rise of microservices architectures and the API-led approach, it seemed that we had forgotten about the importance of messaging systems, and we had to go through the same challenges we saw in the past to arrive at a new messaging solution for the same problem. So we're coming back to EDA and the pub-sub mechanism to help us decouple consumers and producers, moving from orchestration to choreography, and all of these concepts fit better in today's world, with more and more independent components that need cooperation and integration.

During this effort, we started to look at new technologies to help us implement that again. Still, in this new reality, we set aside the heavy protocols and standards like JMS and started to consider other options. And we need to admit that a new king has emerged in this area, one of those components that seem to be mandatory in today's architectures: Apache Kafka.

And don't get me wrong. Apache Kafka is fantastic: it has been proven in production for a long time, it is performant, it has impressive replay capabilities, and it offers a powerful API to ease integration. But Apache Kafka also has some challenges in this cloud-native world, because it doesn't play so well with some of its rules.

If you have used Apache Kafka for some time, you are aware of its particular challenges. Its architecture comes from its LinkedIn days in 2011, when Kubernetes, or even Docker and container technologies, were not a thing, which makes running Apache Kafka (a purely stateful service) in containers quite complicated. There are improvements such as Helm charts and operators to ease the journey, but it still doesn't feel like the pieces integrate well in that fashion. Another weak point is geo-replication: even with components like MirrorMaker, it is not something that works smoothly or feels well integrated.

Other technologies are trying to cover those capabilities, and one of them is another Apache Foundation project, donated by Yahoo!, named Apache Pulsar.

Don’t get me wrong; this is not about finding a new truth, that single messaging solution that is perfect for today’s architectures: it doesn’t exist. In today’s world, with so many different requirements and variables for the different kinds of applications, one size fits all is no longer true. So you should stop thinking about which messaging solution is the best one, and think more about which one serves your architecture best and fulfills both technical and business requirements.

We have covered different ways for general communication, with several specific solutions for synchronous communication (service mesh technologies and protocols like REST, GraphQL, or gRPC) and different ones for asynchronous communication. We need to go deeper into the asynchronous communication to find what works best for you. But first, let’s speak a little bit more about Apache Pulsar.

Apache Pulsar

Apache Pulsar, as mentioned above, was developed internally by Yahoo! and donated to the Apache Foundation. As stated on the official website, there are several key points to mention as we start exploring this option:

  • Pulsar Functions: Easily deploy lightweight compute logic using developer-friendly APIs without needing to run your own stream processing engine
  • Proven in production: Apache Pulsar has run in production at Yahoo scale for over three years, with millions of messages per second across millions of topics
  • Horizontally scalable: Seamlessly expand capacity to hundreds of nodes
  • Low latency with durability: Designed for low publish latency (< 5 ms) at scale with strong durability guarantees
  • Geo-replication: Designed for configurable replication between data centers across multiple geographic regions
  • Multi-tenancy: Built from the ground up as a multi-tenant system. Supports isolation, authentication, authorization, and quotas
  • Persistent storage: Persistent message storage based on Apache BookKeeper. Provides IO-level isolation between write and read operations
  • Client libraries: Flexible messaging models with high-level APIs for Java, C++, Python, and Go
  • Operability: REST Admin API for provisioning, administration, tools, and monitoring. Deploy on bare metal or Kubernetes.

So, as we can see, by design Apache Pulsar addresses some of the main weaknesses of Apache Kafka, such as geo-replication and the lack of a cloud-native approach.

Apache Pulsar supports the pub/sub pattern, but it also provides enough capabilities to act as a traditional message queue, thanks to its exclusive subscription mode where only one of the subscribers receives the messages. It also provides interesting concepts and features found in other messaging systems:

  • Dead Letter Topics: For messages that could not be processed by the consumer.
  • Persistent and Non-Persistent Topics: To decide whether or not you want to persist your messages while they are in transit.
  • Namespaces: To have a logical grouping of your topics, so applications can be grouped in namespaces, as we do in Kubernetes, for example, and isolated from one another.
  • Failover: Similar to exclusive, but if the active consumer fails, another consumer takes over processing the messages.
  • Shared: Provides a round-robin approach similar to a traditional message queue, where all subscribers are attached to the topic but each message is delivered to only one of them, distributing the load across all of them.
  • Multi-topic subscriptions: To subscribe to several topics using a regexp (similar to the subject approach of TIBCO Rendezvous in the 90s, for example), which has been so powerful and popular.

But if you require Apache Kafka-like features, you still have similar concepts such as partitioned topics, key-shared subscriptions, and so on. So you have everything at hand to choose the configuration that works best for you and your specific use cases, and you can also mix and match, as the sketch below shows.
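To make those modes concrete, here is a minimal sketch using the github.com/apache/pulsar-client-go/pulsar client against a local broker (the topic names and subscription name are hypothetical, not taken from the article): the subscription type and a multi-topic regexp are just options on the consumer.

package main

import (
	"log"

	"github.com/apache/pulsar-client-go/pulsar"
)

func main() {
	client, err := pulsar.NewClient(pulsar.ClientOptions{URL: "pulsar://localhost:6650"})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Shared subscription: messages are load-balanced across all consumers attached
	// to "my-subscription". Switching to pulsar.Exclusive, pulsar.Failover, or
	// pulsar.KeyShared is just a change of the Type field.
	consumer, err := client.Subscribe(pulsar.ConsumerOptions{
		TopicsPattern:    "persistent://public/default/orders-.*", // multi-topic subscription via regexp
		SubscriptionName: "my-subscription",
		Type:             pulsar.Shared,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()
}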

Apache Pulsar Architecture

Apache Pulsar's architecture is similar to other comparable messaging systems today. As you can see in the picture below from the Apache Pulsar website, these are the main components of the architecture:

[Figure: Apache Pulsar architecture overview, from the Apache Pulsar website]
  • Brokers: One or more brokers handle incoming messages from producers and dispatch messages to consumers
  • A BookKeeper cluster for persistent storage of messages
  • A ZooKeeper cluster for management purposes

So you can see this architecture is quite similar to Apache Kafka's, with the addition of a new concept: the BookKeeper cluster.

Brokers in Apache Pulsar are stateless components that mainly run two pieces:

  • An HTTP server that exposes a REST API for management and is used by consumers and producers for topic lookup.
  • A TCP server running a binary protocol, called the dispatcher, that is used for all data transfers. Usually, messages are dispatched out of a managed ledger cache for performance reasons, but if the backlog outgrows the cache, the broker reads the entries from the BookKeeper cluster.

To support the Global Replication (Geo-Replication), the Brokers manage replicators that tail the entries published in the local region and republish them to the remote regions.

The Apache BookKeeper cluster is used as persistent message storage. Apache BookKeeper is a distributed write-ahead log (WAL) system that manages when messages are persisted. It also supports horizontal scaling based on load and handles multiple logs. Not only the messages are persisted but also the cursors, which track each consumer's position on a specific topic (similar to the offset in Apache Kafka terminology).

Finally, the ZooKeeper cluster plays the same role as in Apache Kafka: it stores metadata and configuration for the whole system.

Hello World using Apache Pulsar

Let's see how to create a quick "Hello World" case using Apache Pulsar, and to do that, we're going to try to implement it in a cloud-native fashion. We will spin up a single-node Apache Pulsar cluster and deploy a producer application using Flogo technology and a consumer application written in Go, something similar to what you can see in the diagram below:

[Diagram of the test case we're building: Flogo producer, Apache Pulsar, Go consumer]

We're going to keep it simple, so we will just use plain Docker this time. First of all, let's spin up the Apache Pulsar server with the following command:

docker run -it -p 6650:6650 -p 8080:8080 --mount source=pulsardata,target=/pulsar/data --mount source=pulsarconf,target=/pulsar/conf apachepulsar/pulsar:2.5.1   bin/pulsar standalone

And we will see an output similar to this one:

[Screenshot: Apache Pulsar standalone server startup output]

Now, we need to create simple applications, and for that, Flogo and Go will be used.

Let’s start with the producer, and in this case, we will use the open-source version to create a quick application.

First of all, we will just use the Web UI (dockerized) to do that. Run the command:

docker run -it -p 3303:3303 flogo/flogo-docker eula-accept

Then we install a new contribution to enable the Pulsar publisher activity. To do that, we click on the "Install new contribution" button and provide the following contribution reference:

flogo install github.com/mmussett/flogo-components/activity/pulsar

And now we will create a simple flow as you can see in the picture below:

[Screenshot: Flogo flow using the Pulsar publisher activity]

We will now build the application using the menu, and that’s it!


To run it, just launch the generated binary as you can see here:

./sample-app_linux_amd64
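If you prefer not to use Flogo for the producer side, a minimal Go producer publishing to the same counter topic could look roughly like this. It is just a sketch: it assumes the standalone broker from the docker command above at localhost:6650 and uses the same pulsar-client-go package we install for the consumer below (recent versions of that client, where Send returns a MessageID and an error).

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/apache/pulsar-client-go/pulsar"
)

func main() {
	client, err := pulsar.NewClient(pulsar.ClientOptions{URL: "pulsar://localhost:6650"})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	producer, err := client.CreateProducer(pulsar.ProducerOptions{Topic: "counter"})
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	// Publish a handful of messages to the same topic the consumer subscribes to.
	for i := 0; i < 10; i++ {
		msgID, err := producer.Send(context.Background(), &pulsar.ProducerMessage{
			Payload: []byte(fmt.Sprintf("hello pulsar %d", i)),
		})
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println("published message", msgID)
	}
}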

Now we just need to create the Go consumer. To do that, we first install the Go client package:

go get github.com/apache/pulsar-client-go/pulsar

And now we need to create the following code:

package main

import (
	"fmt"
	"log"

	"github.com/apache/pulsar-client-go/pulsar"
)

func main() {
	client, err := pulsar.NewClient(pulsar.ClientOptions{URL: "pulsar://localhost:6650"})
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	channel := make(chan pulsar.ConsumerMessage, 100)

	options := pulsar.ConsumerOptions{
		Topic:            "counter",
		SubscriptionName: "my-subscription",
		Type:             pulsar.Shared,
	}
	options.MessageChannel = channel

	consumer, err := client.Subscribe(options)
	if err != nil {
		log.Fatal(err)
	}
	defer consumer.Close()

	// Receive messages from the channel. The channel returns a struct that contains the message and the
	// consumer from which the message was received. It's not necessary here since we have a single consumer,
	// but the channel could be shared across multiple consumers as well.
	for cm := range channel {
		msg := cm.Message
		fmt.Printf("Received message msgId: %v -- content: '%s'\n",
			msg.ID(), string(msg.Payload()))
		consumer.Ack(msg)
	}
}

After running both programs, you can see the following output; as you can see, we were able to connect both applications with very little effort.

[Screenshot: output of the producer and consumer applications]

This article is just a starting point, and we will continue talking about how to use Apache Pulsar in your architectures. If you want to take a look at the code we’ve used in this sample, you can find it here:

Welcome to the AsyncAPI Revolution!


We're living in an age where technologies are shifting and standards are changing all the time. You skip reading Medium/StackOverflow/Reddit for a while and you find there are at least five (5) new industry standards taking the place of the ones you already knew (the ones that were released barely a year ago 🙂).

Do you still remember the old days when SOAP was the unbeatable format? How much time did we spend building SOAP services in our enterprises? REST replaced it as the new standard, but just a few years later we're back in a new battle, just for synchronous communication: gRPC and GraphQL are here to conquer everything again. It is crazy, huh?

But the situation is similar for asynchronous communication. Asynchronous communication has been here for a long time, long before Event-Driven Architecture or Streaming were really "cool" terms or something you had to be aware of.

We've been using these patterns for a long time in our companies. Big enterprises have used this model in their enterprise integrations for years. Pub/sub-based protocols and technologies like TIBCO Rendezvous have been in use since the late 90s, and then we also incorporated more standard approaches like JMS, using different kinds of servers to handle all this event-based communication.

But now, with the cloud-native revolution and the need for distributed computing, more agility, and more scalability, centralized solutions are no longer valid, and we've seen an explosion in the number of options to communicate based on these patterns.

You could think this is the same situation we discussed at the beginning of this article regarding REST's predominance and new cutting-edge technologies trying to replace it, but this is quite different, because experience has taught us that one size doesn't fit all.

You cannot find a single technology or component that can cover all the communication needs of all your use cases. Name any technology or protocol you want: Kafka, Pulsar, JMS, MQTT, AMQP, Thrift, FTL, and so on.

Think about each of them and you will probably find some use cases where one technology plays better than the others, so it makes no sense to try to find a single technology to cover all the needs. What is needed is more of a polyglot approach, where you have different technologies that play well together and you use the one that works best for each use case (the right tool for the right job), just as we do for the different technologies we deploy in our clusters.

We're probably not going to use the same technology for a machine-learning-based microservice as for a streaming application, right? The same principle applies here.

But the problem when we try to make different technologies play together is standardization. If we think about REST, gRPC, or GraphQL, even though they're different, they share some common ground: they rely on the same base HTTP protocol as a standard, so it is easy to support all of them in the same architecture.

But this is not true for asynchronous communication technologies, and that's why I'd like to focus on standardization and specification today. That's what the AsyncAPI Initiative is trying to solve. To define what AsyncAPI is, I'd like to use their own words from the official website:

AsyncAPI is an open source initiative that seeks to improve the current state of Event-Driven Architectures (EDA). Our long-term goal is to make working with EDA’s as easy as it is to work with REST APIs. That goes from documentation to code generation, from discovery to event management. Most of the processes you apply to your REST APIs nowadays would be applicable to your event-driven/asynchronous APIs too.

So, their goal is to provide a set of tools for a better experience in all those EDA architectures that companies have, or are starting to have, and everything pivots around one thing: the AsyncAPI Specification.

Similar to the OpenAPI Specification, it allows us to define a common interface for our EDA interfaces, and the most important part is that it is protocol-agnostic: the same specification can be used for your MQTT-based API or your Kafka API. Let's take a look at what this AsyncAPI Specification looks like:

[Figure: AsyncAPI 2.0 definition, from https://www.asyncapi.com/docs/getting-started/coming-from-openapi/]

As you can see, it is very similar to OpenAPI 3.0, and that was done on purpose, to ease the transition between OpenAPI 3.0 and AsyncAPI and to try to join both worlds together: it is about APIs, no matter whether they're synchronous or asynchronous, and about providing the same ecosystem benefits on both sides.

Show me the code!!

But let's stop talking and start coding, and to do that I'd like to use one of the tools that, in my view, has the greatest support for AsyncAPI, and that is Project Flogo.

You probably remember some of the posts I've written about Project Flogo and TIBCO Flogo Enterprise as a great technology for your microservices development (low-code/all-code approach, Golang-based, with a lot of connectors and open-source extensions as well).

But today we're going to use it to create our first AsyncAPI-compliant microservice, and we're going to rely on it because it provides a set of extensions to support the AsyncAPI initiative, as you can see here:

So the first thing we're going to do is create our AsyncAPI definition, and to keep it simple, we're going to use the sample available in the AsyncAPI documentation with one small change: we're going to switch from the AMQP protocol to the Kafka protocol, because that is what's cool these days, isn't it? 😉

asyncapi: '2.0.0'
info:
  title: Hello world application
  version: '0.1.0'
servers:
  production:
    url: broker.mycompany.com
    protocol: kafka
    description: This is "My Company" broker.
    security:
      - user-password: []
channels:
  hello:
    publish:
      message:
        $ref: '#/components/messages/hello-msg'
  goodbye:
    publish:
      message:
        $ref: '#/components/messages/goodbye-msg'
components:
  messages:
    hello-msg:
      payload:
        type: object
        properties:
          name:
            type: string
          sentAt:
            $ref: '#/components/schemas/sent-at'
    goodbye-msg:
      payload:
        type: object
        properties:
          sentAt:
            $ref: '#/components/schemas/sent-at'
  schemas:
    sent-at:
      type: string
      description: The date and time a message was sent.
      format: datetime
  securitySchemes:
    user-password:
      type: userPassword

As you can see, it is quite simple: two channels, "hello" and "goodbye", each with an easy payload (mirrored as Go types right after the list):

  • name: Name that we’re going to use for the greeting.
  • sentAt: The date and time a message was sent.
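To make the contract tangible in code, these payloads could be mirrored by Go types like the following. This is a hypothetical hand-written mapping, not something generated by the AsyncAPI tooling.

package messages

// HelloMsg mirrors #/components/messages/hello-msg from the AsyncAPI definition above.
type HelloMsg struct {
	Name   string `json:"name"`
	SentAt string `json:"sentAt"` // ISO-8601 timestamp string, per the sent-at schema
}

// GoodbyeMsg mirrors #/components/messages/goodbye-msg.
type GoodbyeMsg struct {
	SentAt string `json:"sentAt"`
}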

So the first thing we're going to do is create a Flogo application that complies with that AsyncAPI specification:

git clone https://github.com/project-flogo/asyncapi.git
cd asyncapi/
go install

Now we have the generator installed, so we only need to execute it, providing our YAML file as the input to the following command:

asyncapi -input helloworld.yml -type flogodescriptor

And it will create a Hello World application for us, which we need to tweak a little bit. To get you up and running quickly, I'm sharing the code in my GitHub repository so you can borrow from it (but I really encourage you to take the time to look at the code and see the beauty of Flogo app development 🙂).

https://github.com/project-flogo/asyncapi

Now that we have the app, we have a simple dummy application that receives messages complying with the specification and, in our case, just logs the payload, which can be our starting point to build new Event-Driven Microservices compliant with AsyncAPI.

So, let's try it, but to do so, we need a few things. First of all, we need a Kafka server running, and to get one quickly we're going to leverage the following docker-compose.yml file:

version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper:3.4.6
    expose:
    - "2181"
  kafka:
    image: wurstmeister/kafka:2.11-2.0.0
    depends_on:
    - zookeeper
    ports:
    - "9092:9092"
    environment:
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181

And to run it, we just need to fire the following command from the same folder where we saved this file as docker-compose.yml:

docker-compose up -d

After doing that, we just need a sample application, and what better than to use Flogo again to create it, but this time let's use the Graphical Viewer to create it right away:

[Screenshot: a simple Flogo application that sends an AsyncAPI-compliant message each minute using Kafka]

We just need to configure the Publish Kafka activity with the broker (localhost:9092), the topic (hello), and the message:

{
"name": "hello world",
"sentAt": "2020-04-24T00:00:00"
}
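By the way, if you would rather script the tester than build the Flogo flow, a small Go publisher could send an equivalent payload to the hello topic against the docker-compose broker at localhost:9092. This sketch uses the Shopify/sarama Kafka client, which is my choice here and not something the article prescribes.

package main

import (
	"encoding/json"
	"log"
	"time"

	"github.com/Shopify/sarama"
)

func main() {
	cfg := sarama.NewConfig()
	cfg.Producer.Return.Successes = true // required by the SyncProducer

	producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	// Build a payload that matches the hello-msg schema of the AsyncAPI definition.
	payload, _ := json.Marshal(map[string]string{
		"name":   "hello world",
		"sentAt": time.Now().UTC().Format(time.RFC3339),
	})

	if _, _, err := producer.SendMessage(&sarama.ProducerMessage{
		Topic: "hello",
		Value: sarama.ByteEncoder(payload),
	}); err != nil {
		log.Fatal(err)
	}
	log.Println("sent hello-msg to the hello topic")
}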

And that's it! Let's run it!

First we start the AsyncAPI Flogo Microservice:

[Screenshot: the AsyncAPI Flogo microservice started]

And then we just launch the tester, which is going to send the same message each minute, as you can see in the picture below:

[Screenshot: the sample tester sending messages]

And each time we send that message, it is received by our AsyncAPI Flogo microservice:

[Screenshot: messages received by the AsyncAPI Flogo microservice]

So, I hope this first introduction to the AsyncAPI world has been of interest to you, but don't forget to take a look at more resources on their own website:

Kubernetes Batch Processing with TIBCO BusinessWorks: Jobs, Patterns, and Use Cases


We all know that with the rise of cloud-native development and architectures, Kubernetes-based platforms have become the new standard, with new developments following the new paradigms and best practices: microservices, Event-Driven Architectures, new shiny protocols like GraphQL or gRPC, and so on and so forth.

This article is part of my comprehensive TIBCO Integration Platform Guide where you can find more patterns and best practices for TIBCO integration platforms.


Prometheus Monitoring in TIBCO Cloud Integration


In previous posts, I’ve explained how to integrate TIBCO BusinessWorks 6.x / BusinessWorks Container Edition (BWCE) applications with Prometheus, one of the most popular monitoring systems for cloud layers. Prometheus is one of the most widely used solutions to monitor your microservices inside a Kubernetes cluster. In this post, I will explain steps to leverage Prometheus for integrating with applications running on TIBCO Cloud Integration (TCI).

TCI is TIBCO's iPaaS and primarily hides the application management complexity from users. You only need your packaged application (a.k.a. EAR) and the manifest.json, both generated by the product, to deploy the application.

Isn’t it magical? Yes, it is! As explained in my previous post related to Prometheus integration with BWCE, which allows you to customize your base images, TCI allows integration with Prometheus in a slightly different manner. Let’s walk through the steps.

TCI has its own embedded monitoring tools to provide insights into memory and CPU utilization, plus network throughput, which is very useful.

While the monitoring metrics provided out-of-the-box by TCI are sufficient for most scenarios, there are hybrid connectivity use-cases (application running on-prem and microservices running on your own cluster that could be on a private or public cloud) that might require a unified single-pane view of monitoring.

Step one is to import the Prometheus plugin from the current GitHub location into your BusinessStudio workspace. To do that, you just need to clone the GitHub Repository available here: https://github.com/TIBCOSoftware/bw-tooling OR https://github.com/alexandrev/bw-tooling

Import the Prometheus plugin by choosing Import → Plug-ins and Fragments option and specifying the directory downloaded from the above mentioned GitHub location. (shown below)

[Screenshots: Import → Plug-ins and Fragments dialog]

Step two involves adding the Prometheus module previously imported to the specific application as shown below:

[Screenshot: adding the Prometheus module to the application]

Step three is just to build the EAR file along with manifest.json.

NOTE: If the EAR doesn’t get generated once you add the Prometheus plugin, please follow the below steps:

  • Export the project with the Prometheus module to a zip file.
  • Remove the Prometheus project from the workspace.
  • Import the project from the zip file generated before.

Before you deploy the BW application on TCI, we need to enable an additional port on TCI to scrape the Prometheus metrics.

Step four is updating the manifest.json file.

By default, a TCI app described by the manifest.json file only exposes one port to be consumed from outside (for the functional services) and another used internally for health checks.


For Prometheus integration with TCI, we need an additional port listening on 9095, so Prometheus server can access the metrics endpoints to scrape the required metrics for our TCI application.

Note: This document does not cover the details of setting up the Prometheus server (it is NOT needed for this PoC), but you can find the relevant information at https://prometheus.io/docs/prometheus/latest/installation/

We need to slightly modify the generated manifest.json file (of the BW app) to expose an additional port, 9095 (shown below).

[Screenshot: manifest.json exposing the additional port 9095]

Also, to tell TCI that we want to enable the Prometheus endpoint, we need to set a property in the manifest.json file: TCI_BW_CONFIG_OVERRIDES with the value BW_PROMETHEUS_ENABLE=true, as shown below:

[Screenshot: TCI_BW_CONFIG_OVERRIDES property in manifest.json]

We also need to add an additional line (propertyPrefix) in the manifest.json file as shown below.

[Screenshot: propertyPrefix line added to manifest.json]

Now we are ready to deploy the BW app on TCI, and once it is deployed we can see there are two endpoints:

[Screenshot: the deployed app showing two endpoints]

If we expand the Endpoints options on the right (shown above), you can see that one of them is named “prometheus” and that’s our Prometheus metrics endpoint:

Just copy the prometheus URL and append it with /metrics (URL in the below snapshot) — this will display the Prometheus metrics for the specific BW app deployed on TCI.

Note: appending /metrics is not compulsory; the as-is URL for the Prometheus endpoint will also work.

[Screenshot: Prometheus metrics exposed by the BW app on TCI]

In that list you will find the following kinds of metrics, which you can use to build dashboards and analysis on top of this information:

  • JVM metrics around memory used, GC performance and thread pool counts
  • CPU usage by the application
  • Process and Activity execution counts by Status (Started, Completed, Failed, Scheduled..)
  • Duration by Activity and Process.

With all this information available, you can create dashboards similar to the one shown below, in this case using Spotfire as the dashboarding tool:

[Screenshot: Spotfire dashboard built on the Prometheus metrics]

But you can also integrate those metrics with Grafana or any other tool that can read data from the Prometheus time-series database.

[Screenshot: Grafana dashboard using the same metrics]

Detect Performance Bottlenecks in TIBCO BusinessWorks Container Edition Using Statistics


Usually, when you're developing or running your container application, you will reach a moment when something goes wrong, but not in a way you can solve with your logging system or with testing.

A moment when there is some bottleneck, something that is not performing as well as you want, and you'd like to take a look inside. And that's what we're going to do: we're going to look inside.

BusinessWorks Container Edition provides great features for this, and you should use them in your favor. So I don't want to spend one more minute on the preamble; let's get started right now.

The first thing we need to do is get inside the OSGi console of the container. So we start by exposing port 8090, as you can see in the picture below:

[Screenshot: exposing port 8090 in the deployment]

Now we can forward that port to our host using the port-forward command:

kubectl port-forward deploy/phenix-test-project-v1 8090:8090

And then we can execute an HTTP request to run any OSGi command like this:

curl -v http://localhost:8090/bw/framework.json/osgi?command=<command>

We're first going to activate the process statistics collection like this:

curl -v http://localhost:8090/bw/framework.json/osgi?command=startpsc
[Screenshot: output of the startpsc command]

As you can see, it says that statistics have been enabled for the echo application, so using that application name we're going to gather the statistics at the process level:

curl -v http://localhost:8090/bw/framework.json/osgi?command=lpis%20echo
[Screenshot: process-level statistics output]

In the process-level statistics you can see the following metrics:

  • Process metadata (name, parent process, and version)
  • Total instances by status (created, suspended, failed, and executed)
  • Execution time (total, average, min, max, most recent)
  • Elapsed time (total, average, min, max, most recent)

And we can get the statistics at the activity level:

[Screenshot: activity-level statistics output]

And with that, you can detect any bottleneck you're facing in your application and also pinpoint which activity or process is responsible for it, so you can fix it quickly.
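If you find yourself running these commands often, you can wrap the same OSGi endpoint in a tiny Go helper. This is just a sketch: only the URL and the commands shown in this post (startpsc, lpis <app>) are assumed to exist, and it expects the port-forward from above to be active.

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
	"os"
)

// runOSGi calls the BWCE framework endpoint with the given OSGi command
// (e.g. "startpsc" or "lpis echo") and prints the raw response.
func runOSGi(command string) error {
	endpoint := "http://localhost:8090/bw/framework.json/osgi?command=" + url.QueryEscape(command)
	resp, err := http.Get(endpoint)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	fmt.Println(string(body))
	return nil
}

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: osgi <command>   (quote multi-word commands, e.g. \"lpis echo\")")
	}
	if err := runOSGi(os.Args[1]); err != nil {
		log.Fatal(err)
	}
}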

Have fun and use the tools at your disposal!

Kubernetes Service Discovery for Prometheus: Dynamic Scraping the Right Way


In previous posts, we described how to set up Prometheus to work with your TIBCO BusinessWorks Container Edition apps, and you can read more about it here.

In that post, we described that there are several ways to tell Prometheus about the services that are ready to be monitored, and we chose the simplest one at that moment, the static_configs configuration, which basically means:

Don’t worry Prometheus, I’ll let you know the IP you need to monitor and you don’t need to worry about anything else.

And this is useful for a quick test in a local environment, when you want to quickly test your Prometheus setup or work on the Grafana side to design the best possible dashboard for your needs.

But this is not very useful for a real production environment, even more so when we're talking about a Kubernetes cluster where services are going up and down continuously over time. To solve this, Prometheus allows us to define different ways to perform this "service discovery". In the official Prometheus documentation we can read a lot about the different service discovery techniques, but at a high level these are the main ones available:

  • azure_sd_configs: Azure Service Discovery
  • consul_sd_configs: Consul Service Discovery
  • dns_sd_configs: DNS Service Discovery
  • ec2_sd_configs: EC2 Service Discovery
  • openstack_sd_configs: OpenStack Service Discovery
  • file_sd_configs: File Service Discovery
  • gce_sd_configs: GCE Service Discovery
  • kubernetes_sd_configs: Kubernetes Service Discovery
  • marathon_sd_configs: Marathon Service Discovery
  • nerve_sd_configs: AirBnB’s Nerve Service Discovery
  • serverset_sd_configs: Zookeeper Serverset Service Discovery
  • triton_sd_configs: Triton Service Discovery
  • static_configs: Static IP/DNS configuration. No service discovery.

And if all these options are not enough and you need something more specific, there is an API available to extend Prometheus and create your own service discovery mechanism. You can find more info about it here:

But this is not our case; for us, Kubernetes service discovery is the right choice. So we're going to change the static configuration we had in the previous post:

- job_name: 'bwdockermonitoring'
  honor_labels: true
  static_configs:
    - targets: ['phenix-test-project-svc.default.svc.cluster.local:9095']
      labels:
        group: 'prod'

for this Kubernetes-based configuration:

- job_name: 'bwce-metrics'
  scrape_interval: 5s
  metrics_path: /metrics/
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - default
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: (.*)
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: prom
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: $1
    action: replace

As you can see, this is more complex than the previous configuration, but it is not as complex as it looks at first glance. Let's review it in parts.

- role: endpoints
    namespaces:
      names:
      - default

This says that we're going to use the endpoints role for endpoints created under the default namespace, and then we're going to specify the transformations needed to find the metrics endpoints for Prometheus.

scrape_interval: 5s
metrics_path: /metrics/
scheme: http

This says that we're going to scrape every 5 seconds, using HTTP on the path /metrics/.

And then, we have a relabel_config section:

- source_labels: [__meta_kubernetes_service_label_app]
    separator: ;
    regex: (.*)
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: prom
    replacement: $1
    action: keep

These rules use the keep action, which drops any target whose source labels do not match the regex; the important one here is the second rule, which keeps only endpoints whose port is named prom. The next set of rules uses the replace action to build the final labels:

- source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: $1
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: $1
    action: replace

Each of these rules replaces a label value, and with them we can do several things (see the small Go illustration after this list):

  • Rename the label name using the target_label to set the name of the final label that we’re going to create based on the source_labels.
  • Replace the value using the regex parameter to define the regular expression for the original value and the replacement parameter that is going to express the changes that we want to do to this value.
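The regex/replacement pair works like the capture-group substitution of any regex engine (Prometheus additionally anchors the regex to the full value). Here is a quick, purely illustrative Go sketch of what regex: (.*) plus replacement: $1 does to a value; the pod name is made up.

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// regex: (.*)  replacement: $1 -> the whole source value is captured
	// and written unchanged into the target label (e.g. "pod").
	re := regexp.MustCompile(`(.*)`)
	fmt.Println(re.ReplaceAllString("phenix-test-project-v1-7d9f", "$1"))

	// A replacement can also reshape the value, e.g. keeping only the app name.
	re2 := regexp.MustCompile(`([a-z-]+)-v\d+-.*`)
	fmt.Println(re2.ReplaceAllString("phenix-test-project-v1-7d9f", "$1")) // prints "phenix-test-project"
}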

So, now after applying this configuration when we deploy a new application in our Kubernetes cluster, like the project that we can see here:

we will automatically see an additional target in our "bwce-metrics" job configuration.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Prometheus TIBCO Monitoring for Containers: Quick and Simple in 5 Minutes!


Prometheus is becoming the new standard for Kubernetes monitoring and today we are going to cover how we can do Prometheus TIBCO monitoring in Kubernetes.

This article is part of my comprehensive TIBCO Integration Platform Guide where you can find more patterns and best practices for TIBCO integration platforms.

We’re living in a world with constant changes and this is even more true in the Enterprise Application world. I’ll not spend much time talking about things you already know, but just say that the microservices architecture approach and the PaaS solutions have been a game-changer for all enterprise integration technologies.

This time I'd like to talk about monitoring and the integration capabilities we have to use Prometheus to monitor our microservices developed with TIBCO technology. I don't want to spend too much time either talking about what Prometheus is, as you probably already know, but in summary, it is an open-source monitoring platform, the second project hosted by the Cloud Native Computing Foundation (after Kubernetes itself), and it has established itself as a de-facto industry standard for monitoring Kubernetes clusters (alongside other options in the market like InfluxDB and so on).

Prometheus has a lot of great features, and one of them is that it has integrations for almost everything, which is very important today because it is complicated (and unusual) to build a platform with a single product for the PaaS layer. So today, I want to show you how to monitor your TIBCO BusinessWorks Container Edition applications using Prometheus.

Most of the information I'm going to share is available in the bw-tooling GitHub repo, so you can refer to it if you need to validate any specific statement.

Ok, are we ready? Let’s start!!

I’m going to assume that we already have a Kubernetes cluster in place and Prometheus installed as well. So, the first step is to enhance the BusinessWorks Container Edition base image to include the Prometheus capabilities integration. To do that we need to go to the GitHub repo page and follow these instructions:

  • Download & unzip the prometheus-integration.zip folder.
  • Open TIBCO BusinessWorks Studio and point it to a new workspace.
  • Right-click in Project Explorer → Import… → select Plug-ins and Fragments → select the Import from the directory radio button.
  • Browse to the prometheus-integration folder (unzipped in step 1).
  • Click Next → select the Prometheus plugin → click the Add button → click Finish. This will import the plugin into the studio.
  • Now, to create the JAR of this plugin, we first need to make sure the Bundle-Classpath field of com.tibco.bw.prometheus.monitor is set to '.' (dot) in the META-INF/MANIFEST.MF file.
  • Right-click on the plugin → Export → Export…
  • Select the type as JAR file and click Next.
  • Click Next → Next → select the radio button to use the existing MANIFEST.MF file and browse to the manifest file.
  • Click Finish. This will generate prometheus-integration.jar.

Now, with the JAR created, we need to include it in our own base image. To do that, we place the JAR file in <TIBCO_HOME>/bwce/2.4/docker/resources/addons/jar:


And we launch the image build command again from the <TIBCO_HOME>/bwce/2.4/docker folder to update the base image (use the version you're running at the moment):

docker build -t bwce_base:2.4.4 .

So now we have an image with Prometheus support! Great! We're close to the finish line; we just need to create an image for our container application. In my case, this is going to be a very simple echo service that you can see here.

And we only need to keep these things in mind when we deploy to our Kubernetes cluster:

  • We should set the BW_PROMETHEUS_ENABLE environment variable to "TRUE"
  • We should expose port 9095 from the container so Prometheus can scrape it.

Now we only need to provide this endpoint to the Prometheus scraper. There are several ways to do that, but we're going to focus on the simplest one.

We need to change the prometheus.yml to add the following job data:

- job_name: 'bwdockermonitoring'
  honor_labels: true
  static_configs:
    - targets: ['phenix-test-project-svc.default.svc.cluster.local:9095']
      labels:
        group: 'prod'
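Before (or after) pointing Prometheus at it, you can sanity-check that the application is really exposing metrics on port 9095. This is a minimal Go probe under a couple of assumptions of mine: the service has been port-forwarded to localhost:9095, and the exposition path is /metrics as used elsewhere in this series.

package main

import (
	"bufio"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	// Fetch the Prometheus text exposition from the BWCE app and print the samples,
	// skipping blank lines and # HELP / # TYPE comments.
	resp, err := http.Get("http://localhost:9095/metrics")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fmt.Println(line) // one sample per line: metric name, labels, and value
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}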

And after restarting Prometheus, we have all the data indexed in the Prometheus database, ready to be used by any dashboarding system.


In this case, I'm going to use Grafana to build a quick dashboard.

[Screenshot: Grafana dashboard built on the BWCE Prometheus metrics]

Each of these graph components is configured based on the metrics scraped by the Prometheus TIBCO exporter.
