Telemetry in distributed systems

 
Ranch Hand
Posts: 39
For a distributed system such as a set of microservices, telemetry can be generated by applications running on VMs, bare-metal machines, containers, or serverless functions, potentially in different parts of the world. This is a challenge from the telemetry perspective: how do you trace the flow of information, for example? Does your book cover best practices for dealing with these issues?
 
Author
Posts: 26
Distributed systems have gotten a lot of attention lately, and it sometimes feels like parts of the community consider distributed systems the only way to do anything. Any microservice or distributed system has a lot of component parts, each with its own highly local problems and constraints. This is why tracing has exploded as a telemetry type in the last five or so years: it's best suited to providing telemetry for an entire ecosystem rather than a specific system.

This is where OpenTelemetry (OTel) comes into play. OTel is a telemetry format, a Shipping Stage technology, and it relies on a service that can consume OTel-formatted data to provide goodness. Jaeger is the most prevalent open source system supporting OTel, but SaaS providers like Honeycomb, New Relic, and Datadog can all consume it as well. In fact, many newer, microservice-from-the-start companies don't bother with centralized logging, or even metrics, and rely entirely on tracing!

That said, tracing is a rather heavy service to provide, so you're not going to get datastores with a year-plus of telemetry in them without DIYing it or paying stupendous amounts of money to a SaaS provider. When companies start feeling this pressure, they start looking at dedicated metrics systems, since those handle long timelines far better.

Tracing is designed from the start to handle high cardinality. In fact, tracing as we know it today grew out of frustration with existing metrics systems and how bad they were at handling cardinality. Metrics systems are great for telling you how the system as a whole is behaving, but they don't have the cardinality room to answer the "how did this one user session perform" or "what happened to this account at 1932 last Thursday" kind of questions. Tracing does, and goes even further by looping in centralized logging concepts to give the troubleshooter everything that happened in that workflow, regardless of the services it passed through.
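
To make the cardinality point concrete, here is a rough back-of-the-envelope sketch in Python. The label names and counts are invented for illustration; the only assumption is the usual time-series model, in which every distinct combination of label values becomes its own series.

```python
from math import prod

def series_count(label_cardinalities):
    """Upper bound on distinct time series for one metric: the product of
    the number of distinct values each label can take."""
    return prod(label_cardinalities.values())

# Hypothetical label sets; the numbers are made up for illustration.
whole_system = {"service": 20, "region": 5, "status_code": 6}
per_session = {**whole_system, "session_id": 100_000}

print(series_count(whole_system))  # 600
print(series_count(per_session))   # 60000000
```

One per-session label turns a few hundred series into tens of millions, which is the kind of pressure described above.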

But what if you need to provide metrics and centralized logging for a distributed system?

An intentionally designed centralized logging system can begin to provide the "everything that happened in that workflow" concept, but won't give you that wonderful waterfall graphic that shows you the timeline of execution split out by service. You get there by agreeing, across the product, to pass correlation identifiers along as part of execution. Each step in a workflow accepts identifiers and passes them on to the next service, and every service uses them as fields in its logging. Some common identifiers:

  • account_id - the account doing the action
  • session_id - the browser-session doing the action
  • request_id - the specific request that triggered the action; this correlates all actions in the workflow
  • execution_id - the specific execution of a service

These all allow you to bring up all the logging for a specific series of events. Agree on them across your product's services, and you make an existing centralized logging system far more useful than it would be otherwise. Strongly recommended.
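
As a rough illustration of passing those identifiers along, here is a minimal Python sketch using only the standard library. The header names, the helper functions, and the "upload-service" logger are all hypothetical, and keeping execution_id local to each hop is my own assumption about how that field would be used, not something prescribed above.

```python
import json
import logging
import uuid

# Hypothetical header names; pick your own and agree on them across services.
ID_HEADERS = {
    "account_id": "X-Account-Id",
    "session_id": "X-Session-Id",
    "request_id": "X-Request-Id",
}

def extract_ids(headers):
    """Read the shared identifiers from an incoming request and mint the
    ones that are local to this hop."""
    ids = {field: headers.get(header) for field, header in ID_HEADERS.items()}
    # The first service in the workflow creates request_id; later ones reuse it.
    if ids["request_id"] is None:
        ids["request_id"] = uuid.uuid4().hex
    # Assumption: execution_id is always new, naming this one execution.
    ids["execution_id"] = uuid.uuid4().hex
    return ids

def outgoing_headers(ids):
    """Headers to attach when calling the next service."""
    return {header: ids[field] for field, header in ID_HEADERS.items()
            if ids.get(field) is not None}

def log_event(logger, ids, message, **fields):
    """Emit one JSON log line that carries every identifier as a field."""
    logger.info(json.dumps({"message": message, **ids, **fields}))

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("upload-service")

ids = extract_ids({"X-Account-Id": "acct-42", "X-Session-Id": "sess-7"})
log_event(log, ids, "upload accepted", upload_size=1024)
next_hop = outgoing_headers(ids)  # send these along with the downstream call
```

With every service logging the same field names, a search on one request_id in the centralized logging system pulls up the whole workflow.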

These identifiers would be great in a metrics system, but they tend to be quite high cardinality. If your metrics system can handle that (perhaps you're using a newer time-series system backed by a columnar database), use them there too. For metrics systems in general, though, you mostly have to give up the idea of providing metrics for individual executions or accounts and focus on whole-system telemetry. Focus instead on providing a common schema for capturing the key concepts used by all of your microservices. Having a common schema allows you to get around the sheer flexibility of English, which can wreak havoc in a time-series database. Consider the following attributes; they all capture the same thing but are phrased differently:

  • upload_size
  • file_size
  • size
  • size_upd
  • transfer_size

This is a classic example of asking five engineers to name the same thing and getting five different answers. By having an agreed-upon schema you unify many of these concepts, which reduces the cardinality burden on the database and makes it perform faster. It also gives engineers working on different services a better idea of how the metrics for any given service are shaped, which improves the maintainability of the overall system.
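
One way such a schema might be enforced is sketched below in Python. Treating upload_size as the canonical name is an arbitrary choice for the example, and the normalize helper is hypothetical; the aliases are the five spellings listed above.

```python
# Hypothetical shared schema: one canonical name per concept, plus the
# variants that should be folded into it.
CANONICAL = {
    "upload_size": {"file_size", "size", "size_upd", "transfer_size"},
}

# Reverse lookup: alias -> canonical name.
ALIASES = {alias: name for name, aliases in CANONICAL.items() for alias in aliases}

def normalize(attributes):
    """Rename known aliases to their canonical name. Unknown names are
    rejected so new spellings get added to the schema on purpose rather
    than by accident."""
    result = {}
    for key, value in attributes.items():
        name = ALIASES.get(key, key)
        if name not in CANONICAL:
            raise KeyError(f"attribute {key!r} is not in the shared schema")
        if name in result:
            raise ValueError(f"{key!r} duplicates {name!r}")
        result[name] = value
    return result

print(normalize({"file_size": 2048}))  # {'upload_size': 2048}
```

Running something like this at the point where metrics are emitted means the time-series database only ever sees one name per concept.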
     