Managed Container Platforms Explained: 3 Key Business Benefits You Can’t Ignore

Managed Container Platforms Explained: 3 Key Business Benefits You Can’t Ignore

Managed Container Platform provides advantages to any system inside any company. Take a look at the three critical ones.

Managed Container Platform is disrupting everything. We’re living in a time where development and the IT landscape are changing, new paradigms like microservices and containers seem to be out there for the last few years, and if we trust the reality that the blog posts and the articles show today, we’re all of the users already using them all the time.

Did you see any blog posts about how to develop a J2EE application running on your Tomcat server on-prem? Probably not. The most similar article should probably be how to containerize your tomcat-based application.

But do you know what? Most companies still are working that way. So even if all companies have a new digital approach in some departments, they also have other ones being more traditional.

So, that seems that we need to find a different way to translate the main advantages of a container-based platform to a kind of speech they can see and realize the tangible benefits they can get from there and have the “Hey, this can work for me!” kind of spirit.

1. You will get all components isolated and updated more quickly

That’s one of the great things about container-based platforms compared with previous approaches like application server-based platforms. When you have an application server cluster, you still have one cluster with several applications. So you usually do some isolation, keep related applications, provide independent infrastructure for the critical ones, and so on.

But even with that, at some level, the application continues to be coupled, so some issues with some applications could bring down another one that was not expected for business reasons.

With a container-based platform, you’re getting each application in its bubble, so any issue or error will affect that application and nothing more. Platform stability is a priority for all companies and all departments inside them. Just ask yourself: Do you want to end with those “domino’s chains” of failure? How much will your operations improve? How much will your happiness increase?

Additionally, based on the container approach, you will get smaller components. Each of them will do a single task providing a single capability to your business, which means that it will be much easier to update, test, and deploy in production. So that, in the end will generate more deployments into the production environment and reduce the time to market for your business capabilities.

You will be able to deploy faster and have more stable operations simultaneously.

2.- You will optimize the use of your infrastructure

Costs, everything is about costs. There are no single conversations with customers who are not trying to pay less for their infrastructure. So, let’s face it. We should be able to run operations in an optimized way. So, if our infrastructure cost is going higher, that needs to mean that our business increases.

Container-based platforms will allow optimizing infrastructure in two different ways. First, if using two main concepts: Elasticity and Infrastructure Sharing.

Elasticity is related because I’m only going to have the infrastructure I need to support the load I have at this moment. So, if the load increases, my infrastructure will increase to handle it, but after that moment goes away, it will go back to what it needs now after that peak happened.

Infrastructure sharing is about using each server’s part that is free to deploy other applications. Imagine a traditional approach where I have two servers for my set of applications. Probably I don’t have 100% usage of those servers because I need to have some spare computer to be able to act when the load increases. I probably have 60–70% percent. That means 30% free. If we have different departments with different systems, and each has its infrastructure 30% free, how much of our infrastructure are we just throwing away? How many dollars/euros/pounds are you just throwing off the window?

Container-based platforms don’t need specific tools or software installed on the platform to run a different kind of application. It is not required because everything resides inside the container, so I can use any free space to deploy other applications doing a more efficient usage of those.

3.- You will not need infrastructure for administration

Each system that is big enough has some resources dedicated to being able to manage it. However, even most of the recommended architectures recommend placing those components isolated from your runtime components to avoid any issue regarding administrator or maintenance that can affect your runtime workloads, which means specific infrastructure that you’re using for something that isn’t helping your business. Of course, you can explain to any business user that you need a machine to run that provides the capabilities required. But it is more complex than using additional infrastructure (and generating cost) to place other components that are not helping the business.

So, managed container platforms take that problem away because you’re going to provide the infrastructure you need to run your workloads, and you’re going to be given for free or such low fee the administration capabilities. And addition to that, you don’t even need to worry that administration features are always available and working fine because this is leverage to the provider itself.

Wrap up and next steps

As you can see, we describe very tangible benefits that are not industry-based or development focus. Of course, we can have so many more to add to this list, but these are the critical ones that affect any company in any industry worldwide. So, please, take your time to think about how these capabilities can help to improve your business. But not only that, take your time to quantify how that will enhance your business. How much can you save? How much can you get from this approach?

And when you have in front of you a solid business case based on this approach, you will get all the support and courage you need to move forward during that route!! So I wish you a peaceful transition!

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Harbor Registry Explained: Securing Container Images in Kubernetes and DevSecOps

Harbor Registry Explained: Securing Container Images in Kubernetes and DevSecOps

Learn how you can include Harbor registry in your DevSecOps toolset to increase the security and management on your container-based platform

With the transition to a more agile development process where the number of deployments has been increased in an exponential way. That situation has made it quite complex to keep pace to make sure we’re not just deploying code more often into production that provides the capabilities that are required by the business. But, also, at the same time, we’re able to do it securely and safely.

That need is leading toward the DevSecOps idea to include security as part of the DevOps culture and practices as a way to ensure safety from the beginning on development and across all the standard steps from the developer machine to the production environment.

Additional to that, because of the container paradigm we have a more polyglot approach with different kinds of components running on our platform using a different base image, packages, libraries, and so on. We need to make sure they’re still secure to use and we need tools to be able to govern that in a natural way. To help us on that duty is where components like Harbor help us to do that.

Harbor is a CNCF project at the incubator stage at the moment of writing this article, and it provides several capabilities regarding how to manage container images from a project perspective. It gives a project approach with its docker registry and also a chart museum if we’d like to use Helm Charts as part of our project development. But it includes security features too, and that’s the one that we’re going to cover in this article:

  • Vulnerabilities Scan: it allows you to scan all the docker images registered in the different repositories to check if they have vulnerabilities. It also provides automation during that process to make sure that every time we push a new image, this is scanned automatically. Also, it will enable defining policies to avoid pulling any image with vulnerabilities and also set the level of vulnerabilities (low, medium, high, or critical) that we’d like to tolerate it. By default, it comes with Clair as the default scanner, but you can introduce others as well.
  • Signed images: Harbor registry provides options to deploy notary as part of its components to be able to sign images during the push process to make sure that no modifications are done to that image
  • Tag Inmuttability and Retention Rules: Harbor registry also provides the option to define tag immutability and retention rules to make sure that we don’t have any attempt to replace images with others using the same tag.

Harbor registry is based on docker so you can run it locally using docker and docker-compose using the procedure that is available on its official web page. But it also supports being installed on top of your Kubernetes platform using the helm chart and operator that is available.

Once the tool is installed, we have access to the UI Web Portal, and we’re able to create a project that has repositories as part of it.

Harbor Registry: How to use to increase security on your platform?
Project List inside the Harbor Portal UI

As part of the project configuration, we can define the security policies that we’d like to provide to each project. That means that different projects can have different security profiles.

Harbor Registry: How to use to increase security on your platform?
Security settings inside a Project in Harbor Porta UI

And once we push a new image to the repository that belongs to that project we’re going to see the following details:

Harbor Registry: How to use to increase security on your platform?

In this case, I’ve pushed a TIBCO BusinessWorks Container Edition application that doesn’t contain any vulnerability and just shows that and also where this was checked.

Also, if we see the details, we can check additional information like if the image has been signed or not, or be able to check it again.

Harbor Registry: How to use to increase security on your platform?
Image details inside Harbor Portal UI

Summary

So, this is just a few features that Harbor provides from the security perspective. But Harbor is much more than only that so probably we cover more of its features in further articles I hope based on what you read today you’d like to give it a chance and start introducing it in your DevSecOps toolset.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

API Management vs Service Mesh: Differences, Use Cases, and When You Need Both

API Management vs Service Mesh: Differences, Use Cases, and When You Need Both

Service Mesh vs. API Management Solution: is it the same? Are they compatible? Are they rivals?

When we talk about communication in a distributed cloud-native world and especially when we are talking about container-based architectures based on Kubernetes platform like AKS, EKS, Openshift, and so on, two technologies generate a lot of confusion because they seem to be covering the same capabilities: Those are Service Mesh and API Management Solutions.

It is has been a controversial topic where different bold statements have been made: People who think that those technologies to work together in a complementary mode, others who believe that they’re trying to solve the same problems in different ways and even people who think one is just the evolution of the other to the new cloud-native architecture.

API Management Solutions

API Management Solutions have been part of our architectures for so long. It is a crucial component of any architecture nowadays that is created following the principles of the API-Led Architecture, and they’re an evolution of the pre-existent API Gateway we’ve included as an evolution of the pure proxies in the late 90s and early 2000.

API Management Solutions is a critical component of your API Strategy because it enable your company to work on an API Led Approach. And that is much more than the technical aspect of it. We usually try to simplify the API Led Approach to the technical side with the API-based development and the microservices we’re creating and the collaborative spirit in mind we use today to make any piece of software that is deployed on the production environment.

But it is pretty much more than that. API Lead Architectures is about creating products from our API, providing all the artifacts (technical and non-technical) that we need to do that conversion. A quick list of those artifacts (but it is not an exhaustive list are the following ones)

  • API Documentation Support
  • Package Plans Definition
  • Subscription capabilities
  • Monetization capabilities
  • Self-Service API Discovery
  • Versioning capabilities

Traditionally, the API Management solution also comes with API Gateway capabilities embedded to cover even the technical aspect of it, and that also provide some other capabilities more in the technical level:

  • Exposition
  • Routing
  • Security
  • Throttling

Service Mesh

Service Mesh is more a buzz word these days and a technology that is now trending because it has been created to solve some of the challenges that are inherent to the microservice and container approach and everything under the cloud-native label.

In this case, it comes from the technical side so, it is much more a bottom-top approach because their existence is to be able to solve a technical problem and try to provide a better user experience to the new developers and system administrators in this new world much more complicated. And what are the challenges that have been created in this transition? Let’s take a look at them:

Service Registry & Discovery is one of the critical things that we need to cover because with the elastic paradigm of the cloud-native world makes that the services are switching its location from time to time being started in new machines when needed, remove of them when there is no enough load to require its presence, so it is essential to provide a way to easily manage that new reality that we didn’t need in the past when our services were bounded to a specific machine or set of devices.

Security is another important topic in any architecture we can create today, and with the polyglot approach we’ve incorporated in our architectures is another challenging thing because we need to provide a secure way to communicate our services that are supported by any technology we’re using and anyone we can use in the future. And we’re not talking just about pure Authentication but also Authorization because in a service-to-service communication we also need to provide a way to check if the microservice that is calling another one is allowed to do so and do that in an agile way not to stop all the new advantages that your cloud-native architecture provides because of its conception.

Routing requirements also have been changed in these new architectures. If you remember how we usually deploy in traditional architectures, we typically try to find a zero down-time approach (when possible) but a very standard procedure. Deploy a new version, validate its working, and open the traffic for anyone, but today the requirements claim for much more complex paradigms. The Service Mesh technologies support rollout strategies like A/B Testing, Weight-based routing, Canary deployments.

Rival or Companion?

So, after doing a quick view of the purpose of these technologies and the problem they tried to solve, are they rivals or companions? Should we choose one or the other or try to place both of them in our architecture?

Like always, the answer to those questions is the same: “It depends!”. It depends on what you’re trying to do, what your company is trying to achieve, what you’re building..

  • API Management solution is needed as long as you’re implementing an API Strategy in your organization. Service Mesh technology is not trying to fill that gap. They can provide technical capabilities to cover that traditional has been done the API Gateway component, but this is just one of the elements of the API Management Solution. The other parts that provide the management and the governance capabilities are not covered by any Service Mesh today.
  • Service Mesh is needed if you have a cloud-native architecture based on the container platform that is firmly based on HTTP communication for synchronous communication. It provides so many technical capabilities that will make your life much more manageable that as soon as you include it into your architecture, you cannot live without it.
  • Service Mesh is only going to provide its capabilities in a container platform approach. So, if you have a more heterogeneous landscape as much of the enterprise do today, (you have a container platform but also other platforms like SaaS application, some systems still on-prem and traditional architectures that all of them are providing capabilities that you’d like to leverage as part of the API products), you will need to include an API Management Solution.

So, these technologies can play together in a complete architecture to cover different kinds of requirements, especially when we’re talking about complex heterogeneous-architectures with a need to include an API Lead approach.

In upcoming articles, we will cover how we can integrate both technologies from the technical aspect and how the data flow among the different components of the architecture.

Observability in Polyglot Microservice Architectures: Tracing Without Friction

Observability in Polyglot Microservice Architectures: Tracing Without Friction

Learn how to manage observability requirements as part of your microservice ecosystem

“May you live in interesting times” is the English translation of the Chinese curse, and this couldn’t be more true as a description of the times that we’re living regarding our application architecture and application development.

All the changes from the cloud-native approach, including all the new technologies that come with it like containers, microservices, API, DevOps, and so on has transformed the situation entirely for any architecture, developer, or system administration.

It’s something similar if you went to bed in 2003, and you wake up in 2020 all the changes, all the new philosophies, but also all the unique challenges that come with the changes and new capabilities are things that we need to deal with today.

I think we all can agree the present is polyglot in terms of application development. Today is not expected for any big company or enterprise to find an available technology, an available language to support all their in-house products. Today we all follow and agree on the “the right tool for the right job principle” to try to create our toolset of technologies that we are going to use to solve different use cases or patterns that you need to face.

But that agreement and movement also come with its challenge regarding things that we usually don’t think about like Tracing and Observability in general.

When we use a single technology, everything is more straightforward. To define a common strategy to trace your end to end flows is easy; you only need to embed the logic into your common development framework or library all your developments are using. Probably define a typical header architecture with all the data that you need to be able to effectively trace all the requests and define a standard protocol to send all those traces to a standard system that can store and correlate all of them and explain the end to end flow. But try to move that to a polyglot ecosystem: Should I write my framework or library for each language or technology I’d need to use, or I can also use in the future? Does that make sense?

But not only that, should I slow the adoption of a new technology that can quickly help the business because I need to provide from a shared team this kind of standard components? That is the best case that I have enough people that know how the internals of my framework work and have the skills in all the languages that we’re adopting to be able to do it quickly and in an efficient way. It seems unlikely, right?

So, to new challenges also new solutions. I’m already have been talking about Service Mesh regarding the capabilities that provide from a communication perspective, and if you don’t remember you can take a look at those posts:

But it also provides capabilities from other perspectives and Tracing, and Observability is one of them. Because when we cannot include those features in any technology, we need to use, we can do it in a general technology that is supported by all of them, and that’s the case with Service Mesh.

As Service Mesh is the standard way to communicate synchronously, your microservices in an east-west communication fashion covering the service-to-service communication. So, if you’re able to include in that component also the tracing capability, you can have an end-to-end tracing without needed to implement anything in each of the different technologies that you can use to implement the logic that you need, so, you’ve been changing from Figure A to Figure B in the picture below:

Observability in Polyglot Microservice Architectures: Tracing Without Friction
In-App Tracing logic implementation vs. Service Mesh Tracing Support

And that what most of the Service Mesh technologies are doing. For example, Istio, as one of the default choices when it comes to Service Mesh, includes an implementation of the OpenTracing standard that allows integration with any tool that supports the standard to be able to collect all the tracing information for any technology that is used to communicate across the mesh.

So, that mind-change allows us to easily integrates different technologies without needed any exceptional support of those standards for any specific technology. Does that mean that the implementation of those standards for those technologies is not required? Not at all, that is still relevant, because the ones that also support those standards can provide even more insights. After all, the Service Mesh only knows part of the information that is the flow that’s happening outside of each technology. It’s something similar to a black-box approach. But also adding the support for each technology to the same standard provides an additional white-box approach as you can see graphically in the image below:

Observability in Polyglot Microservice Architectures: Tracing Without Friction
Merging White Box Tracing Data and Black Box Tracing Data

We already talked about the compliance of some technologies with the OpenTracing standard like TIBCO BusinessWorks Container Edition that you can remember it here:

So, also, the support from these technologies of the industry standards is needed and even a competitive advantage because without needing to develop your tracing framework, you’re able to achieve a Complete Tracing Data approach additional to that is already provided by the Service Mesh level itself.

Rename Prometheus Metrics Using metric_relabel_configs (Change Metric Names Safely)

Rename Prometheus Metrics Using metric_relabel_configs (Change Metric Names Safely)

Find a way to re-define and re-organize the name of your Prometheus metrics to meet your requirements

Prometheus has become the new standard when we’re talking about monitoring our new modern application architecture, and we need to make sure we know all about its options to make sure we can get the best out of it. I’ve been using it for some time until I realized about a feature that I was desperate to know how to do, but I couldn’t find anywhere clearly define. So as I didn’t found it easily, I thought about writing a small article to show you how to do it without needed to spend the same time as I did.

We have plenty of information about how to configure Prometheus and use some of the usual configuration plugins, as we can see on its official webpage [1]. Even I already write about some configuration and using it for several purposes, as you can see also in other posts [2][3][4].

One of these configuration plugins is about relabeling, and this is a great thing. We have that each of the exporters can have its labels and meaning for those, and when you try to manage different technologies or components makes complex that all of them match together even if all of them follow the naming convention that Prometheus has [5].

But I had this situation, and I’m sure you have gone or will go towards that as well, that I have similar metrics for different technologies that for me are the same, and I need to keep them with the same name, but as they belong to other technologies they are not. So I need to find a way to rename the metric, and the great thing is that you can do that.

To do that, you just need to do a metric_relabel configuration. This configuration applies to relabel (as the name already indicates) labels of your prometheus metrics in this case before being ingested but also allow us to use some notable terms to do different things, and one of these notable terms is __name__. __name__ is a particular label that will enable you to rename your prometheus metrics before being ingested in the Prometheus Timeseries Database. And after that point, this will be as it will have that name since the beginning.

How to use that is relatively easy, is as any other relabel process, and I’d like to show you a sample about how to do it.

- source_labels: [__name__]
regex:  'jvm_threads_current'
target_label: __name__
replacement: 'process_thread_count'

Here it is a simple sample to show how we can rename a metric name jvm_threads_current to count the threads inside the JVM machine to do it more generic to be able to include the threads for the process in a process_thread_count prometheus metrics that we can use now as it was the original name.


References

[1] Prometheus: Configuration https://prometheus.io/docs/prometheus/latest/configuration/configuration/

[2] https://medium.com/@alexandrev/prometheus-monitoring-in-tibco-cloud-integration-96a6811416ce

[3] https://medium.com/@alexandrev/prometheus-monitoring-for-microservices-using-tibco-772018d093c4

[4] https://medium.com/@alexandrev/kubernetes-service-discovery-for-prometheus-fcab74237db6

[5] Prometheus: Metric and Label Naming https://prometheus.io/docs/practices/naming/

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense

Apache Kafka seems to be the standard solution in nowadays architecture, but we should focus if it is the right choice for our needs.

Nowadays, we’re in a new age of Event-Driven Architecture, and this is not the first time we’ve lived that. Before microservices and cloud, EDA was the new normal in enterprise integration. Based on different kinds of standards, there where protocols like JMS or AMQP used in broker-based products like TIBCO EMS, Active MQ, or IBM Websphere MQ, so this approach is not something new.

With the rise of microservices architectures and the API lead approach, it seemed that we’ve forgotten about the importance of the messaging systems, and we had to go through the same challenges we saw in the past to come to a new messaging solution to solve that problem. So, we’re coming back to EDA Architecture, pub-sub mechanism, to help us decouple the consumers and producers, moving from orchestration to choreography, and all these concepts fit better in nowaday worlds with more and more independent components that need cooperation and integration.

During this effort, we started to look at new technologies to help us implement that again. Still, with the new reality, we forgot about the heavy protocols and standards like JMS and started to think about other options. And we need to admit that we felt that there is a new king in this area, and this is one of the critical components that seem to be no matter what in today’s architecture: Apache Kafka.

And don’t get me wrong. Apache Kafka is fantastic, and it has been proven for so long, a production-ready solution, performant, with impressive capabilities for replay and powerful API to ease the integration. Apache Kafka has some challenges in this cloud-native world because it doesn’t play so well with some of its rules.

If you have used Apache Kafka for some time, you are aware that there are particular challenges with it. Apache Kafka has an architecture that comes from its LinkedIn days in 2011, where Kubernetes or even Docker and container technologies were not a thing, that makes to run Apache Kafka (purely stateful service) in a container fashion quite complicated. There are improvements using helm charts and operators to ease the journey, but still, it doesn’t feel like pieces can integrate well into that fashion. Another thing is the geo-replication that even with components like MirrorMaker, it is not something used, works smooth, and feels integrated.

Other technologies are trying to provide a solution for those capabilities, and one of them is also another Apache Foundation project that has been donated by Yahoo! and it is named Apache Pulsar.

Don’t get me wrong; this is not about finding a new truth, that single messaging solution that is perfect for today’s architectures: it doesn’t exist. In today’s world, with so many different requirements and variables for the different kinds of applications, one size fits all is no longer true. So you should stop thinking about which messaging solution is the best one, and think more about which one serves your architecture best and fulfills both technical and business requirements.

We have covered different ways for general communication, with several specific solutions for synchronous communication (service mesh technologies and protocols like REST, GraphQL, or gRPC) and different ones for asynchronous communication. We need to go deeper into the asynchronous communication to find what works best for you. But first, let’s speak a little bit more about Apache Pulsar.

Apache Pulsar

Apache Pulsar, as mentioned above, has been developed internally by Yahoo! and donated to the Apache Foundation. As stated on their official website, they are several key points to mention as we start exploring this option:

  • Pulsar Functions: Easily deploy lightweight compute logic using developer-friendly APIs without needing to run your stream processing engine
  • Proven in production: Apache Pulsar has run in production at Yahoo scale for over three years, with millions of messages per second across millions of topics
  • Horizontally scalable: Seamlessly expand capacity to hundreds of nodes
  • Low latency with durability: Designed for low publish latency (< 5ms) at scale with strong durability guarantees
  • Geo-replication: Designed for configurable replication between data centers across multiple geographic regions
  • Multi-tenancy: Built from the ground up as a multi-tenant system. Supports Isolation, Authentication, Authorization, and Quotas
  • Persistent storages: Persistent message storage based on Apache BookKeeper. Provides IO-level isolation between write and read operations
  • Client libraries: Flexible messaging models with high-level APIs for Java, C++, Python and GO
  • Operability: REST Admin API for provisioning, administration, tools, and monitoring. Deploy on bare metal or Kubernetes.

So, as we can see, in its design, Apache Pulsar is addressing some of the main weaknesses of Apache Kafka as Geo-replication and their cloud-native approach.

Apache Pulsar provides support for the pub/sub pattern, but also provides so many capabilities that also place as a traditional queue messaging system with their concept of exclusive topics where only one of the subscribers will receive the message. Also provides interesting concepts and features used in other messaging systems:

  • Dead Letter Topics: For messages that were not able to be processed by the consumer.
  • Persistent and Non-Persistent Topics: To decide if you want to persist your messages or not during the transition.
  • Namespaces: To have a logical distribution of your topics, so an application can be grouped in namespaces as we do, for example, in Kubernetes so we can isolate some applications from the others.
  • Failover: Similar to exclusive, but when the attached consumer failed to process another takes the chance to process the messages.
  • Shared: To be able to provide a round-robin approach similar to the traditional queue messaging system where all the subscribers will be attached to the topic, but the only one will receive the message, and it will distribute the load along all of them.
  • Multi-topic subscriptions: To be able to subscribe to several topics using a regexp (similar to the Subject approach from TIBCO Rendezvous, for example, in the 90s) that has been so powerful and popular.

But also, if you require features from Apache Kafka, you will still have similar concepts as partitioned topics, key-shared topics, and so on. So you have everything at your hand to choose which kind of configuration works best for you and your specific use cases, you also have the option to mix and match.

Apache Pulsar Architecture

Apache Pulsar Architecture is similar to other comparable messaging systems today. As you can see in the picture below from the Apache Pulsar website, those are the main components of the architecture:

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense
  • Brokers: One or more brokers handles incoming messages from producers, dispatches messages to consumers
  • BookKeeper Cluster for persistent storage of messages management
  • ZooKeeper Cluster for management purposes.

So you can see this architecture is also quite similar to the Apache Kafka one again with the addition of a new concept of the BookKeeper Cluster.

Broker in Apache Pulsar are stateless components that mainly will run two pieces

  • HTTP Server that exposes a REST API for management and is used by consumers and producers for topic lookup.
  • TCP Server using a binary protocol called dispatcher that is used for all the data transfers. Usually, Messages are dispatched out of a managed ledger cache for performance purposes. But also if this cache grows too big, it will interact with the BookKeeper cluster for persistence reasons.

To support the Global Replication (Geo-Replication), the Brokers manage replicators that tail the entries published in the local region and republish them to the remote regions.

Apache BookKeeper Cluster is used as persistent message storage. Apache BookKeeper is a distributed write-ahead log (WAL) system that manages when messages should be persisted. It also supports horizontal scaling based on the load and multi-log support. Not only messages are persistent but also the cursors that are the consumer position for a specific topic (similar to the offset in Apache Kafka terminology)

Finally, Zookeeper Cluster is used in the same role as Apache Kafka as a metadata configuration storage cluster for the whole system.

Hello World using Apache Pulsar

Let’s see how we can create a quick “Hello World” case using Apache Pulsar as a protocol, and to do that, we’re going to try to implement it in a cloud-native fashion. So we will do a single-node cluster of Apache Pulsar in a Kubernetes installation and deploy a producer application using Flogo technology and a consumer application using Go. Something similar to what you can see in the diagram below:

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense
Diagram about the test case we’re doing

And we’re going to try to keep it simple, so we will just use pure docker this time. So, first of all, just spin up the Apache Pulsar server and to do that we will use the following command:

docker run -it -p 6650:6650 -p 8080:8080 --mount source=pulsardata,target=/pulsar/data --mount source=pulsarconf,target=/pulsar/conf apachepulsar/pulsar:2.5.1   bin/pulsar standalone

And we will see an output similar to this one:

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense

Now, we need to create simple applications, and for that, Flogo and Go will be used.

Let’s start with the producer, and in this case, we will use the open-source version to create a quick application.

First of all, we will just use the Web UI (dockerized) to do that. Run the command:

docker run -it -p 3303:3303 flogo/flogo-docker eula-accept

And we install a new contribution to enable the Pulsar publisher activity. To do that we will click on the “Install new contribution” button and provide the following URL:

flogo install github.com/mmussett/flogo-components/activity/pulsar

And now we will create a simple flow as you can see in the picture below:

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense

We will now build the application using the menu, and that’s it!

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense

To be able to run just launch the application as you can see here:

./sample-app_linux_amd64

Now, we just need to create the Go-lang consumer to be able to do that we need to install the golang package:

go get github.com/apache/pulsar-client-go/pulsar

And now we need to create the following code:

package main
import (
 “fmt”
 “log”
 “github.com/apache/pulsar-client-go/pulsar”
)
func main() {
 client, err := pulsar.NewClient(pulsar.ClientOptions{URL: “pulsar://localhost:6650”})
 if err != nil {
 log.Fatal(err)
 }
defer client.Close()
channel := make(chan pulsar.ConsumerMessage, 100)
options := pulsar.ConsumerOptions{
 Topic: “counter”,
 SubscriptionName: “my-subscription”,
 Type: pulsar.Shared,
 }
options.MessageChannel = channel
consumer, err := client.Subscribe(options)
 if err != nil {
 log.Fatal(err)
 }
defer consumer.Close()
// Receive messages from channel. The channel returns a struct which contains message and the consumer from where
 // the message was received. It’s not necessary here since we have 1 single consumer, but the channel could be
 // shared across multiple consumers as well
 for cm := range channel {
 msg := cm.Message
 fmt.Printf(“Received message msgId: %v — content: ‘%s’\n”,
 msg.ID(), string(msg.Payload()))
consumer.Ack(msg)
}
}

And after running both programs, you can see the following output as you can see, we were able to communicate both applications in an effortless flow.

Apache Pulsar vs Apache Kafka: Architecture, Cloud-Native Design, and When Pulsar Makes Sense

This article is just a starting point, and we will continue talking about how to use Apache Pulsar in your architectures. If you want to take a look at the code we’ve used in this sample, you can find it here:

Kubernetes Batch Processing with TIBCO BusinessWorks: Jobs, Patterns, and Use Cases

Kubernetes Batch Processing with TIBCO BusinessWorks: Jobs, Patterns, and Use Cases
Table Of Contents

Add a header to begin generating the table of contents

We all know that in the rise of the cloud-native development and architectures, we’ve seen Kubernetes based platforms as the new standard all focusing on new developments following the new paradigms and best practices: Microservices, Event-Driven Architectures new shiny protocols like GraphQL or gRPC, and so on and so forth.

This article is part of my comprehensive TIBCO Integration Platform Guide where you can find more patterns and best practices for TIBCO integration platforms.

!– /wp:paragraph –>

Detect Performance Bottlenecks in TIBCO BusinessWorks Container Edition Using Statistics

Detect Performance Bottlenecks in TIBCO BusinessWorks Container Edition Using Statistics

Usually, when you’re developing or running your container application you will get to a moment when something goes wrong. But not in a way you can solve with your logging system and with testing.

A moment when there is some bottleneck, something that is not performing as well as you want, and you’d like to take a look inside. And that’s what we’re going to do. We’re going to watch inside.

Because our BusinessWorks Container Edition provides so great features to do it that you need to use it into your favor because you’re going to thank me for the rest of your life. So, I don’t want to spend one more minute about this. I’d like to start telling you right now.

The first thing we need to do, we need to go inside the OSGi console from the container. So, the first thing we do is to expose the 8090 port as you can see in the picture below

Detect Performance Bottlenecks in TIBCO BusinessWorks Container Edition Using Statistics

Now, we can expose that port to your host, using the port-forward command

kubectl port-forward deploy/phenix-test-project-v1 8090:8090

And then we can execute an HTTP Request to execute any info using commands like this:

curl -v http://localhost:8090/bw/framework.json/osgi?command=<command>

And we’re are going to execute first the activation of the process statistics like this:

curl -v http://localhost:8090/bw/framework.json/osgi?command=startpsc
Detect Performance Bottlenecks in TIBCO BusinessWorks Container Edition Using Statistics

And as you can see it says that statistics has been enabled for echo application, so using that application name we’re going to gather the statistics at the level

curl -v http://localhost:8090/bw/framework.json/osgi?command=lpis%20echo
Detect Performance Bottlenecks in TIBCO BusinessWorks Container Edition Using Statistics

And you can see the statistics at the process level where you can see the following metrics:

  • Process metadata (name, parent process and version)
  • Total instance by status (create, suspended, failed and executed)
  • Execution time (total, average, min, max, most recent)
  • Elapsed time (total, average, min, max, most recent)

And we can get the statistics at the activity level:

Detect Performance Bottlenecks in TIBCO BusinessWorks Container Edition Using Statistics

And with that, you can detect any bottleneck you’re facing into your application and also be sure which activity or which process is responsible for it. So you can solve it in a quick way.

Have fun and use the tools at your disposal!