Chapter 1.5: ROS 2 Architecture

Learning Objectives

By the end of this chapter, you will be able to:

  • Explain the core concepts of ROS 2 (nodes, topics, services, actions, parameters)
  • Understand the ROS 2 architecture and how components interact
  • Identify the differences between ROS 1 and ROS 2 architectures
  • Describe the DDS (Data Distribution Service) middleware in ROS 2
  • Set up a basic ROS 2 workspace and run simple nodes
  • Understand Quality of Service (QoS) policies in ROS 2
  • Recognize the role of ROS 2 in a complete robotic system

Introduction

The Robot Operating System 2 (ROS 2) is the backbone of modern robotic software development. Despite its name, ROS 2 is not an operating system like Linux or Windows; it is middleware that provides libraries and tools to help you build robot applications. It handles communication between the components of your robot, allowing you to focus on developing specific behaviors rather than managing inter-process communication.

Think of ROS 2 as a communication framework that connects various parts of your robot - the perception system, planning system, control system, and hardware interfaces. When your robot's camera detects an object, that information travels through ROS 2 to the planning system which decides how to approach it, then the plan goes to the control system which moves the robot.

In this chapter, we'll explore the fundamental architecture of ROS 2 and how its core concepts enable the construction of complex robotic systems. Understanding this architecture is crucial for building any substantial robotic application, as it determines how different components communicate and coordinate with each other.

1. What is ROS 2

Evolution from ROS 1 to ROS 2

The original Robot Operating System (ROS) was developed in 2007 to address the fragmentation in robotics software development. Before ROS, each laboratory and company built their own communication, visualization, and development tools from scratch.

ROS became popular but had limitations for production robotics:

  • Single point of failure (master-based architecture)
  • Centralized communication model
  • Limited security capabilities
  • Real-time limitations

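ROS 2, first published as alpha releases in 2015 and officially released with the Ardent Apalone distribution in December 2017, addressed these concerns by: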

  • Decentralizing the architecture (no master required)
  • Implementing modern middleware (DDS)
  • Adding security and real-time capabilities
  • Supporting multiple programming languages

ROS 2 Goals

ROS 2 was designed with specific goals for production robotics:

Reliability: ROS 2 is designed for robots operating in real-world environments where failures have consequences. The decentralized architecture eliminates the single point of failure from ROS 1.

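Security: With the rise of connected robots, security became paramount. ROS 2 supports authentication, encryption, and access control through the DDS-Security specification (exposed by the SROS2 tooling); these features are opt-in and must be enabled explicitly.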

Real-time Performance: Many robotic applications require predictable timing. ROS 2 supports real-time systems with appropriate configuration.

Commercial Deployment: ROS 2 aims for industrial and consumer applications, requiring support for a wider range of hardware and software configurations.

Standardization: ROS 2 uses standards like DDS (Data Distribution Service) rather than custom protocols, making integration with existing systems easier.

Note: ROS 2 is not backward compatible with ROS 1. While the concepts are similar, the implementation details differ significantly. ROS 1 reached End of Life in May 2025.

2. Core Concepts

2.1 Nodes

Nodes are the basic units of computation in ROS 2. Each node performs a specific task and communicates with other nodes through ROS 2 topics, services, actions, and parameters. A node usually runs in its own process, although several nodes can share one.

[Camera Node] ----> [Image Processing Node]
                              |
                              v
                    [Object Recognition Node]

Characteristics of Nodes:

  • Nodes typically run as separate processes, though several nodes can be composed into a single process
  • Nodes can run on different machines
  • Each node can be written in a different programming language (Python, C++, etc.)
  • Nodes are namespaced to avoid naming conflicts
  • Nodes can be started, stopped, and monitored individually
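
To make the namespacing point concrete, here is a minimal sketch of how a name written in code expands into a fully qualified name. It is a simplified, plain-Python model of the ROS 2 name resolution rules (absolute, relative, and private names), not the rclpy implementation; `resolve_name` and the example namespace and node names are hypothetical.

```python
def resolve_name(name: str, namespace: str = "/", node_name: str = "node") -> str:
    """Expand a topic or service name into a fully qualified name (simplified)."""
    ns = "/" + namespace.strip("/")  # normalise "/robot1/" to "/robot1", "" to "/"
    if name.startswith("/"):
        # Absolute names ignore the node's namespace.
        return name
    if name == "~" or name.startswith("~/"):
        # Private names are placed under namespace + node name.
        return ns.rstrip("/") + "/" + node_name + name[1:]
    # Relative names are prefixed with the namespace.
    return ns.rstrip("/") + "/" + name


print(resolve_name("chatter", "/robot1"))             # relative name
print(resolve_name("/tf", "/robot1"))                 # absolute name
print(resolve_name("~/status", "/robot1", "camera"))  # private name
```

Running two copies of the same node under `/robot1` and `/robot2` therefore yields distinct topics without changing the node's code.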

2.2 Topics and Publish-Subscribe

Topics enable one-way, asynchronous communication between nodes using a publish-subscribe pattern.

How Topics Work:

  • Publishers send messages to a topic
  • Subscribers receive messages from a topic
  • Multiple nodes can publish to the same topic
  • Multiple nodes can subscribe to the same topic
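  • Delivery guarantees depend on the QoS settings: with best-effort reliability, publishing is "fire and forget" with no acknowledgment, while reliable QoS retransmits lost messages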

# Publisher code example (assumes `node` is an existing rclpy Node)
from std_msgs.msg import String

publisher = node.create_publisher(String, 'chatter', 10)  # 10 = history depth
msg = String()
msg.data = 'Hello World'
publisher.publish(msg)

# Subscriber code example
def listener_callback(msg):
    node.get_logger().info('I heard: "%s"' % msg.data)

subscriber = node.create_subscription(
    String, 'chatter', listener_callback, 10)

Use Cases for Topics:

  • Sensor data distribution (camera images, laser scans)
  • Robot state broadcasting (joint angles, position)
  • Continuous control commands
  • Event notifications

Conceptual Diagram:

                [Sensor Node]
                      |
                      v
              /topic/sensor_data
                      |
        +-------------+-------------+
        v             v             v
  [Filter Node] [Display Node]  [Log Node]
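
The fan-out in the diagram can be sketched with a toy in-process model. This is plain Python for illustration only; `TopicBus` and the inbox names are hypothetical and are not part of rclpy, which routes messages through DDS rather than direct function calls.

```python
from collections import defaultdict


class TopicBus:
    """Toy stand-in for topic routing: one topic name, many subscribers."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver to every current subscriber; return how many received it."""
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


bus = TopicBus()
filter_inbox, display_inbox, log_inbox = [], [], []
for inbox in (filter_inbox, display_inbox, log_inbox):
    bus.subscribe("/topic/sensor_data", inbox.append)

delivered = bus.publish("/topic/sensor_data", "scan-001")
unheard = bus.publish("/unused_topic", "nobody is listening")
print(delivered, unheard)
```

The publisher never names its receivers: adding a fourth subscriber requires no change on the publishing side, and publishing to a topic with no subscribers is not an error.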

2.3 Services and Request-Response

Services enable two-way communication between nodes using a request-response pattern: each request receives exactly one response.

How Services Work:

  • A service server waits for requests
  • A service client sends a request and receives a response
  • The client can block until the response arrives (call) or receive a future and continue working (call_async, as in the client example in this section)
  • Each response is paired with the request that triggered it

# Service Server (assumes `node` is an existing rclpy Node)
from example_interfaces.srv import AddTwoInts

def add_two_ints(request, response):
    response.sum = request.a + request.b
    return response

service = node.create_service(AddTwoInts, 'add_two_ints', add_two_ints)

# Service Client
client = node.create_client(AddTwoInts, 'add_two_ints')
client.wait_for_service()  # block until a server is available
request = AddTwoInts.Request()
request.a = 2
request.b = 3
future = client.call_async(request)  # returns at once; the response arrives via the future

Use Cases for Services:

  • Configuration changes
  • Triggering one-time actions
  • Requesting specific information
  • Synchronous command execution
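
The request-response pattern with a non-blocking client can be sketched using only the standard library. This is a toy model under stated assumptions: `ToyServiceClient` is hypothetical and merely mimics the shape of an asynchronous call that returns a future; it is not the rclpy client.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def add_two_ints(a: int, b: int) -> int:
    """Stand-in for a service callback."""
    return a + b


class ToyServiceClient:
    """Toy asynchronous service client backed by a worker thread."""

    def __init__(self, handler):
        self._handler = handler
        self._pool = ThreadPoolExecutor(max_workers=1)

    def call_async(self, *args) -> Future:
        # Returns immediately; the handler runs on the worker thread.
        return self._pool.submit(self._handler, *args)

    def close(self):
        self._pool.shutdown(wait=True)


client = ToyServiceClient(add_two_ints)
future = client.call_async(2, 3)
# The caller is free to do other work here before collecting the response.
result = future.result(timeout=1.0)
client.close()
print(result)
```

Waiting on the future is what turns the asynchronous call into a synchronous one, which is why the same service can support both styles.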

2.4 Actions

Actions provide a way to communicate about long-running tasks with feedback and the ability to cancel.

How Actions Work:

  • Similar to services but for long-running tasks
  • Provides feedback during execution
  • Supports goal preemption (cancellation)
  • Built on top of topics and services

Components of Actions:

  • Goal: Request to start an action
  • Feedback: Status updates during execution
  • Result: Final outcome of the action

Use Cases for Actions:

  • Navigation to distant locations
  • Arm movement to a specific pose
  • Image processing with progress updates
  • Any long-running task requiring intervention capability
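
The goal, feedback, result, and cancel lifecycle can be sketched as a small plain-Python model. It is a hypothetical illustration (`run_countdown_goal` is not an rclpy function) and runs in a single thread, whereas a real action server reports feedback over topics and accepts cancel requests over a service.

```python
import threading


def run_countdown_goal(steps, on_feedback, cancel):
    """Execute a toy goal, reporting feedback and honouring cancellation."""
    for remaining in range(steps, 0, -1):
        if cancel.is_set():
            return "canceled"      # result when the client cancels mid-goal
        on_feedback(remaining)     # feedback while the goal is executing
    return "succeeded"             # result when the goal runs to completion


feedback_log = []
cancel_flag = threading.Event()


def record_and_maybe_cancel(remaining):
    feedback_log.append(remaining)
    if remaining == 3:
        cancel_flag.set()          # the client asks to stop after this update


status = run_countdown_goal(5, record_and_maybe_cancel, cancel_flag)
uninterrupted = run_countdown_goal(2, lambda remaining: None, threading.Event())
print(status, feedback_log, uninterrupted)
```

The key difference from a service is visible in the signature: the caller sees intermediate progress and can influence a task that is already running.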

2.5 Parameters

Parameters are key-value pairs that configure node behavior. They can be set at startup or changed during runtime.
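
As a rough illustration of "set at startup or changed during runtime", the sketch below models a parameter store in plain Python. `ToyParameters` is hypothetical; it assumes, as recent ROS 2 distributions do by default, that parameters are declared before use and keep their declared type.

```python
class ToyParameters:
    """Toy parameter store: declared defaults, startup overrides, runtime updates."""

    def __init__(self, overrides=None):
        self._values = {}
        self._overrides = dict(overrides or {})

    def declare(self, name, default):
        # A startup override wins over the declared default.
        self._values[name] = self._overrides.get(name, default)
        return self._values[name]

    def get(self, name):
        return self._values[name]

    def set(self, name, value):
        if name not in self._values:
            raise KeyError(f"parameter '{name}' was never declared")
        expected = type(self._values[name])
        if type(value) is not expected:
            raise TypeError(f"parameter '{name}' expects {expected.__name__}")
        self._values[name] = value


params = ToyParameters(overrides={"max_speed": 0.5})  # e.g. from a launch file
params.declare("max_speed", 1.0)
params.declare("frame_id", "base_link")
startup_speed = params.get("max_speed")
params.set("max_speed", 0.8)                          # runtime change
print(startup_speed, params.get("max_speed"), params.get("frame_id"))
```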

3. DDS (Data Distribution Service) Architecture

What is DDS?

Data Distribution Service (DDS) is a middleware standard that ROS 2 uses for communication. Unlike ROS 1's custom TCPROS and UDPROS protocols, DDS provides a standardized, industry-proven communication fabric.

DDS is specified by the Object Management Group (OMG) and is used in industries requiring high reliability like defense, aviation, and medical devices.

Key DDS Concepts:

  • Data-Centricity: DDS focuses on the data itself rather than the communicating applications
  • Discovery: Automatic detection of nodes with matching topics/types
  • Quality of Service (QoS): Configurable policies for real-time performance and reliability
  • Built-in Security: Encryption and authentication capabilities
  • Language Independence: Supported across multiple programming languages

How DDS Enables ROS 2 Features

Decentralized Architecture:

  • DDS doesn't require a central master like ROS 1
  • Nodes discover each other automatically over the network
  • If one node fails, others continue operating normally
  • Multiple nodes can run on the same machine or distributed systems

Quality of Service (QoS):

  • Publishers and subscribers declare QoS policies
  • Policies specify reliability, durability, liveliness, etc.
  • DDS matches nodes with compatible QoS requirements
  • Critical for real-time robotics applications

Network Discovery:

  • Nodes automatically find each other using DDS discovery protocols
  • Supports unicast and multicast communication
  • Handles nodes joining/leaving the network dynamically
  • Works across different machines with minimal configuration
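
Discovery traffic uses well-known UDP ports, which is useful to know when configuring firewalls. The sketch below applies the default port formula from the RTPS wire protocol that DDS uses; the domain ID (set in ROS 2 through the ROS_DOMAIN_ID environment variable) is not covered elsewhere in this chapter, and vendors can be configured to deviate from these defaults, so treat the numbers as an assumption to verify on your system.

```python
PORT_BASE = 7400        # PB in the RTPS specification
DOMAIN_GAIN = 250       # DG: ports reserved per domain
PARTICIPANT_GAIN = 2    # PG: ports reserved per participant
OFFSETS = {
    "discovery_multicast": 0,
    "user_multicast": 1,
    "discovery_unicast": 10,
    "user_unicast": 11,
}


def rtps_ports(domain_id: int, participant_id: int = 0) -> dict:
    """Return the default UDP ports for one participant in one domain."""
    base = PORT_BASE + DOMAIN_GAIN * domain_id
    ports = {}
    for kind, offset in OFFSETS.items():
        port = base + offset
        if kind.endswith("_unicast"):
            # Unicast ports are additionally shifted per participant.
            port += PARTICIPANT_GAIN * participant_id
        ports[kind] = port
    return ports


print(rtps_ports(0))
print(rtps_ports(1, participant_id=1))
```

Nodes in different domains use disjoint port ranges, which is why they do not discover each other even on the same network.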

Language Independence:

  • Each DDS implementation provides language bindings
  • Nodes in different languages can communicate seamlessly
  • Many DDS implementations offer bindings for languages such as C, C++, Java, and C#, though the exact set varies by vendor
  • ROS 2 extends this with Python and other language bindings

DDS Implementation in ROS 2

ROS 2 doesn't implement DDS itself but uses DDS implementations:

Common DDS Implementations:

  • Fast DDS (formerly Fast RTPS): Open-source implementation developed by eProsima
  • Cyclone DDS: Open-source Eclipse Foundation project, originally contributed by ADLINK
  • RTI Connext DDS: Commercial solution by RTI
  • OpenSplice DDS: Implementation by ADLINK (formerly PrismTech), supported only in early ROS 2 distributions

ROS 2 - DDS Mapping:

  • Each ROS 2 concept maps to DDS concepts:
    • Topics → DDS Topics
    • Publishers → DDS DataWriters
    • Subscribers → DDS DataReaders
    • Services → A pair of DDS topics (one for requests, one for replies)
    • Actions → A combination of services and topics
    • Parameters → Services hosted by each node
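
The mapping is visible in the DDS topic names themselves. The sketch below follows the prefix convention described in the ROS 2 design article on topic and service name mapping (`rt` for topics, `rq` and `rr` for service requests and replies); the helper functions are hypothetical, and the exact names should be treated as an assumption that may vary between distributions and RMW layers.

```python
def dds_topic_name(ros_topic: str) -> str:
    """DDS topic that carries a ROS 2 topic such as '/chatter'."""
    return "rt" + ros_topic


def dds_service_topics(ros_service: str) -> tuple:
    """(request topic, reply topic) that together carry one ROS 2 service."""
    return ("rq" + ros_service + "Request", "rr" + ros_service + "Reply")


print(dds_topic_name("/chatter"))
print(dds_service_topics("/add_two_ints"))
```

Seeing two DDS topics per service explains why a native DDS tool lists more topics than `ros2 topic list` does.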

Middleware Communication:

ROS 2 Application Code
|
v
ROS 2 Client Libraries (rclcpp, rclpy) on top of the common rcl layer
|
v
ROS Middleware Interface (rmw)
|
v
DDS Implementation (FastDDS, CycloneDDS, etc.)
|
v
Transport Protocol (UDP/IP, Shared Memory, etc.)

The middleware abstraction allows ROS 2 to work with different DDS implementations transparently.
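
In practice the DDS vendor is chosen at launch time through the RMW_IMPLEMENTATION environment variable. The helper below is a minimal sketch that only builds the environment; `env_for_rmw` and the short vendor keys are hypothetical, and the selected RMW package must already be installed for the setting to take effect.

```python
import os

# RMW package names for the vendors discussed above.
KNOWN_RMW = {
    "fastdds": "rmw_fastrtps_cpp",
    "cyclonedds": "rmw_cyclonedds_cpp",
    "connext": "rmw_connextdds",
}


def env_for_rmw(vendor: str, base_env=None) -> dict:
    """Return a copy of the environment that selects the given DDS vendor."""
    env = dict(os.environ if base_env is None else base_env)
    env["RMW_IMPLEMENTATION"] = KNOWN_RMW[vendor]
    return env


env = env_for_rmw("cyclonedds", base_env={})
print(env)
# The dictionary could then be passed to a child process, for example:
# subprocess.run(["ros2", "run", "my_robot_talker", "talker"], env=env_for_rmw("cyclonedds"))
```

No application code changes between vendors, which is the practical meaning of the transparency described above.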

Info: ROS 2's use of DDS allows it to benefit from decades of real-time systems research and industrial validation, making it suitable for safety-critical and commercial applications.

4. Quality of Service (QoS) Profiles

What are QoS Profiles?

Quality of Service (QoS) profiles allow ROS 2 users to specify the reliability requirements for communication between nodes. This is essential for robotic applications where some data (like safety-critical information) needs guaranteed delivery while other data (like high-frequency sensor streams) can tolerate occasional loss.

QoS profiles are sets of policies that govern:

  • How messages are delivered
  • How failures are handled
  • How old messages are treated
  • How network resources are used

Key QoS Policies

Reliability Policy:

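  • RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT: Attempt delivery once; messages may be lost on an unreliable network (for sensor data)
  • RMW_QOS_POLICY_RELIABILITY_RELIABLE: Retransmit until delivery is confirmed (for commands); messages can still be dropped if a bounded history queue overflows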

Durability Policy:

  • RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL: Late-joining subscribers get recent messages
  • RMW_QOS_POLICY_DURABILITY_VOLATILE: No historical messages sent to late joiners

History Policy:

  • RMW_QOS_POLICY_HISTORY_KEEP_LAST: Store the last N messages
  • RMW_QOS_POLICY_HISTORY_KEEP_ALL: Store all messages
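
The effect of KEEP_LAST with a given depth can be modelled with a bounded queue. This is a plain-Python sketch, not how the middleware stores samples internally; `keep_last` is a hypothetical helper.

```python
from collections import deque


def keep_last(messages, depth):
    """Return what a KEEP_LAST queue of the given depth still holds."""
    queue = deque(maxlen=depth)   # the oldest entry is discarded when full
    for message in messages:
        queue.append(message)
    return list(queue)


# A slow subscriber with depth 3 that was sent messages 1..7 before reading:
print(keep_last(range(1, 8), depth=3))   # → [5, 6, 7]
```

A larger depth tolerates longer processing stalls at the cost of memory and of acting on older data.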

Liveliness Policy:

  • RMW_QOS_POLICY_LIVELINESS_AUTOMATIC: Based on message publication
  • RMW_QOS_POLICY_LIVELINESS_MANUAL_BY_TOPIC: Manually declared liveliness

Deadline Policy:

  • The expected maximum time between consecutive messages on a topic; a longer gap counts as a missed deadline
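
A deadline is easiest to picture as a check on the gaps between arrivals. The sketch below is a hypothetical offline check on recorded timestamps (in milliseconds), not the middleware's event mechanism.

```python
def missed_deadlines(timestamps_ms, deadline_ms):
    """Return (previous, current) arrival pairs whose gap exceeds the deadline."""
    return [
        (previous, current)
        for previous, current in zip(timestamps_ms, timestamps_ms[1:])
        if current - previous > deadline_ms
    ]


arrivals = [0, 100, 200, 500, 600]   # a 10 Hz stream with one stall
print(missed_deadlines(arrivals, deadline_ms=150))   # → [(200, 500)]
```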

Lease Duration:

  • The maximum time a publisher may go without asserting liveliness before it is considered to have lost it
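
A publisher and a subscription only connect when their policies are compatible: the publisher must offer at least what the subscription requests. The following plain-Python sketch encodes that rule for reliability and durability only; the enum and function names are hypothetical and other policies (deadline, liveliness) have their own compatibility rules.

```python
from enum import IntEnum


class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1


class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1


def qos_compatible(pub_reliability, pub_durability, sub_reliability, sub_durability):
    """True when the publisher's offer satisfies the subscription's request."""
    return pub_reliability >= sub_reliability and pub_durability >= sub_durability


# A RELIABLE publisher can serve a BEST_EFFORT subscription...
print(qos_compatible(Reliability.RELIABLE, Durability.VOLATILE,
                     Reliability.BEST_EFFORT, Durability.VOLATILE))
# ...but a BEST_EFFORT publisher cannot serve a RELIABLE subscription.
print(qos_compatible(Reliability.BEST_EFFORT, Durability.VOLATILE,
                     Reliability.RELIABLE, Durability.VOLATILE))
```

An incompatible pair does not raise an error in application code; the endpoints simply never exchange messages, which is a common source of "silent" communication failures.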

Example QoS Usage

from rclpy.qos import (
    DurabilityPolicy, QoSProfile, ReliabilityPolicy, qos_profile_sensor_data)

# High-frequency sensor data (can lose some messages)
qos_sensor = qos_profile_sensor_data

# Critical command data (must arrive reliably)
qos_reliable = QoSProfile(
    depth=10,
    reliability=ReliabilityPolicy.RELIABLE,
    durability=DurabilityPolicy.VOLATILE,
)

5. Practical Implementation

Workspace Structure

A typical ROS 2 workspace has this structure:

workspace/
├── src/
│   ├── package1/
│   │   ├── CMakeLists.txt
│   │   ├── package.xml
│   │   ├── src/
│   │   ├── include/
│   │   └── test/
│   └── package2/
├── build/
├── install/
└── log/

Key Directories:

  • src/: Source code packages
  • build/: Build artifacts
  • install/: Installation directory after build
  • log/: Log files
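
A directory under src/ counts as a package when it contains a package.xml manifest. The sketch below walks a workspace and lists such directories; `list_workspace_packages` is a hypothetical helper that uses the directory name as a stand-in for the package name (the real name comes from the manifest), and the demo builds a throwaway workspace in a temporary directory.

```python
import tempfile
from pathlib import Path


def list_workspace_packages(workspace: Path) -> list:
    """Return the names of directories under src/ that contain a package.xml."""
    src = workspace / "src"
    return sorted(manifest.parent.name for manifest in src.rglob("package.xml"))


with tempfile.TemporaryDirectory() as tmp:
    ws = Path(tmp)
    for name in ("package1", "package2"):
        (ws / "src" / name).mkdir(parents=True)
        (ws / "src" / name / "package.xml").write_text("<package/>")
    (ws / "src" / "notes").mkdir()          # no manifest, so not a package
    found = list_workspace_packages(ws)

print(found)
```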

Creating a Basic Node

Let's create a simple ROS 2 node:

Step 1: Create the package

cd ~/ros2_ws/src
ros2 pkg create --build-type ament_python my_robot_talker
cd my_robot_talker

Step 2: Create the Python script (my_robot_talker/my_robot_talker/talker.py)

#!/usr/bin/env python3

import rclpy
from rclpy.node import Node
from std_msgs.msg import String


class TalkerNode(Node):

    def __init__(self):
        super().__init__('talker')
        self.publisher = self.create_publisher(String, 'chatter', 10)
        timer_period = 0.5  # seconds
        self.timer = self.create_timer(timer_period, self.timer_callback)
        self.i = 0

    def timer_callback(self):
        msg = String()
        msg.data = f'Hello Robot World: {self.i}'
        self.publisher.publish(msg)
        self.get_logger().info(f'Publishing: "{msg.data}"')
        self.i += 1


def main(args=None):
    rclpy.init(args=args)
    talker = TalkerNode()

    try:
        rclpy.spin(talker)
    except KeyboardInterrupt:
        pass
    finally:
        talker.destroy_node()
        rclpy.shutdown()


if __name__ == '__main__':
    main()

Step 3: Update setup.py

from setuptools import find_packages, setup

package_name = 'my_robot_talker'

setup(
    name=package_name,
    version='0.0.0',
    packages=find_packages(exclude=['test']),
    data_files=[
        ('share/ament_index/resource_index/packages',
            ['resource/' + package_name]),
        ('share/' + package_name, ['package.xml']),
    ],
    install_requires=['setuptools'],
    zip_safe=True,
    maintainer='Your Name',
    maintainer_email='your.email@example.com',
    description='Simple ROS 2 talker',
    license='Apache License, Version 2.0',
    tests_require=['pytest'],
    entry_points={
        'console_scripts': [
            'talker = my_robot_talker.talker:main',
        ],
    },
)
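
The console_scripts entry is what connects the executable name used with `ros2 run` to the Python function that starts the node. The sketch below splits such an entry into its parts; `parse_console_script` is a hypothetical helper for illustration, not part of setuptools.

```python
def parse_console_script(spec: str):
    """Split 'name = package.module:function' into its three parts."""
    name, target = (part.strip() for part in spec.split("=", 1))
    module, function = target.split(":", 1)
    return name, module, function


executable, module, function = parse_console_script(
    "talker = my_robot_talker.talker:main")
print(executable, module, function)
```

Here `talker` is the executable name, `my_robot_talker.talker` is the module to import, and `main` is the function that gets called.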

Step 4: Build and run

cd ~/ros2_ws
colcon build --packages-select my_robot_talker
source install/setup.bash
ros2 run my_robot_talker talker

Tip: When creating ROS 2 packages, always use meaningful names and include a proper package.xml with dependencies, maintainers, and license information.

6. Hands-On Exercise

Exercise: Node Communication with Different QoS

Objective: Create two nodes that communicate with different QoS settings to observe the reliability differences.

Prerequisites:

  • ROS 2 Humble Hawksbill installed
  • Python 3.10 environment (the default on Ubuntu 22.04, which ROS 2 Humble targets)
  • Basic understanding of Python

Steps:

Step 1: Create a New Package

cd ~/ros2_ws/src
ros2 pkg create --build-type ament_python qos_demo
cd qos_demo

Step 2: Create Publisher with Different QoS (qos_demo/qos_demo/publisher.py)

#!/usr/bin/env python3

import rclpy
from rclpy.node import Node
from rclpy.qos import QoSProfile, ReliabilityPolicy, HistoryPolicy
from std_msgs.msg import String


class QoSPublisher(Node):

    def __init__(self):
        super().__init__('qos_publisher')

        # Create a QoS profile with BEST_EFFORT reliability
        qos_best_effort = QoSProfile(
            depth=10,
            reliability=ReliabilityPolicy.BEST_EFFORT,
            history=HistoryPolicy.KEEP_LAST,
        )

        # Create a QoS profile with RELIABLE policy
        qos_reliable = QoSProfile(
            depth=10,
            reliability=ReliabilityPolicy.RELIABLE,
            history=HistoryPolicy.KEEP_LAST,
        )

        # Publisher with BEST_EFFORT
        self.best_effort_pub = self.create_publisher(
            String, 'best_effort_topic', qos_best_effort)

        # Publisher with RELIABLE
        self.reliable_pub = self.create_publisher(
            String, 'reliable_topic', qos_reliable)

        # Timer to send messages every 0.1 seconds
        self.timer = self.create_timer(0.1, self.timer_callback)
        self.counter = 0

    def timer_callback(self):
        # Publish to both topics
        msg = String()
        msg.data = f'Message {self.counter}'

        self.best_effort_pub.publish(msg)
        self.reliable_pub.publish(msg)

        self.get_logger().info(f'Sent: "{msg.data}" to both topics')
        self.counter += 1


def main(args=None):
    rclpy.init(args=args)
    publisher = QoSPublisher()

    try:
        rclpy.spin(publisher)
    except KeyboardInterrupt:
        pass
    finally:
        publisher.destroy_node()
        rclpy.shutdown()


if __name__ == '__main__':
    main()

Step 3: Create Subscriber that Matches QoS (qos_demo/qos_demo/subscriber.py)

#!/usr/bin/env python3

import rclpy
from rclpy.node import Node
from rclpy.qos import QoSProfile, ReliabilityPolicy, HistoryPolicy
from std_msgs.msg import String


class QoSSubscriber(Node):

    def __init__(self):
        super().__init__('qos_subscriber')

        # Subscribe with matching QoS for best effort
        qos_best_effort = QoSProfile(
            depth=10,
            reliability=ReliabilityPolicy.BEST_EFFORT,
            history=HistoryPolicy.KEEP_LAST,
        )

        # Subscribe with matching QoS for reliable
        qos_reliable = QoSProfile(
            depth=10,
            reliability=ReliabilityPolicy.RELIABLE,
            history=HistoryPolicy.KEEP_LAST,
        )

        self.best_effort_sub = self.create_subscription(
            String, 'best_effort_topic', self.best_effort_callback, qos_best_effort)
        self.reliable_sub = self.create_subscription(
            String, 'reliable_topic', self.reliable_callback, qos_reliable)

        self.get_logger().info('QoS Subscriber is ready.')

    def best_effort_callback(self, msg):
        self.get_logger().info(f'Received BEST EFFORT: "{msg.data}"')

    def reliable_callback(self, msg):
        self.get_logger().info(f'Received RELIABLE: "{msg.data}"')


def main(args=None):
    rclpy.init(args=args)
    subscriber = QoSSubscriber()

    try:
        rclpy.spin(subscriber)
    except KeyboardInterrupt:
        pass
    finally:
        subscriber.destroy_node()
        rclpy.shutdown()


if __name__ == '__main__':
    main()

Step 4: Update setup.py to register both nodes

Add the publisher and the subscriber to entry_points in setup.py:

entry_points={
    'console_scripts': [
        'publisher = qos_demo.publisher:main',
        'subscriber = qos_demo.subscriber:main',
    ],
},

Step 5: Build and Test

cd ~/ros2_ws
colcon build --packages-select qos_demo
source install/setup.bash

# Terminal 1: Start publisher
ros2 run qos_demo publisher

# Terminal 2: Source the workspace, then start subscriber
source ~/ros2_ws/install/setup.bash
ros2 run qos_demo subscriber

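Expected Result: The subscriber logs messages from both topics. On a single machine with no packet loss, the two streams will usually look identical; the difference appears when messages are dropped (for example over a congested Wi-Fi link), where the RELIABLE topic retransmits lost messages and the BEST_EFFORT topic simply skips them.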
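
Because a local test rarely loses packets, a small simulation can show what the two policies do on a lossy link. This is a toy model under stated assumptions: each transmission is lost independently with a fixed probability, best effort sends once, and reliable retries up to a hypothetical limit. Real DDS reliability uses acknowledgments and heartbeats rather than blind retries.

```python
import random


def simulate(num_messages, loss_rate, reliable, seed=0, max_retries=50):
    """Return the sequence numbers that reach the subscriber."""
    rng = random.Random(seed)
    received = []
    for seq in range(num_messages):
        attempts = max_retries if reliable else 1
        for _ in range(attempts):
            if rng.random() >= loss_rate:   # this transmission got through
                received.append(seq)
                break
    return received


best_effort_rx = simulate(100, loss_rate=0.2, reliable=False)
reliable_rx = simulate(100, loss_rate=0.2, reliable=True)
print(f"best effort: {len(best_effort_rx)}/100, reliable: {len(reliable_rx)}/100")
```

The reliable stream pays for its completeness with extra transmissions, which on a real link shows up as added latency and bandwidth.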

Troubleshooting:

  • If you get import errors, make sure ROS 2 is properly sourced
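  • If nodes can't communicate, check that the QoS profiles are compatible: a RELIABLE subscription will not connect to a BEST_EFFORT publisher (ros2 topic info --verbose <topic> shows the QoS of each endpoint)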
  • If commands fail, verify your workspace path is correct

Extension Challenge (Optional)

Modify the subscriber to use mismatched QoS profiles and observe what happens to communication. First subscribe to the reliable topic with a BEST_EFFORT policy, then try the reverse (a RELIABLE subscription on best_effort_topic) and compare the results.

7. Assessment Questions (10 questions)

Multiple Choice (5 questions)

Question 1: What does DDS stand for in the context of ROS 2?
  a) Data Distribution System
  b) Data Distribution Service
  c) Dynamic Data System
  d) Distributed Data Service

Answer: b
Explanation: DDS stands for Data Distribution Service, which is the middleware standard that ROS 2 uses for communication.

Question 2: Which ROS 2 communication pattern is suitable for long-running tasks with feedback?
  a) Topics
  b) Services
  c) Actions
  d) Parameters

Answer: c
Explanation: Actions are designed for long-running tasks that provide feedback during execution and can be cancelled, making them ideal for tasks like navigation or manipulation.

Question 3: What is the main advantage of ROS 2 over ROS 1 in terms of architecture?
  a) More programming languages
  b) No centralized master node
  c) Better visualization tools
  d) Faster message transport

Answer: b
Explanation: ROS 2 eliminated the single point of failure by removing the centralized master node, creating a decentralized architecture.

Question 4: Which QoS policy would be most appropriate for sending critical safety commands?
  a) BEST_EFFORT
  b) RELIABLE
  c) VOLATILE
  d) TRANSIENT_LOCAL

Answer: b
Explanation: RELIABLE QoS policy ensures that all messages are delivered, which is critical for safety commands.

Question 5: What is the primary purpose of ROS 2 parameters?
  a) To send temporary data between nodes
  b) To configure node behavior
  c) To store large amounts of sensor data
  d) To synchronize node clocks

Answer: b
Explanation: Parameters are key-value pairs used to configure node behavior, allowing for runtime adjustments without code changes.

Short Answer (3 questions)

Question 6: Explain the difference between ROS 2 topics and services, providing an example use case for each.

Sample answer: Topics enable asynchronous, one-way communication through a publish-subscribe pattern, suitable for continuous data like sensor readings. Services enable request-response communication, suitable for discrete operations like triggering a calibration or requesting specific information.

Question 7: Why is DDS (Data Distribution Service) important for ROS 2's commercial deployment?

Sample answer: DDS provides standardized, industry-proven middleware with features essential for commercial applications, including configurable Quality of Service profiles for reliability and predictable timing, security with encryption and authentication, and language independence that supports integration with existing systems.

Question 8: Describe when you would use the "BEST_EFFORT" versus "RELIABLE" QoS policy in a robotic system.

Sample answer: BEST_EFFORT is appropriate for high-frequency data where occasional packet loss is acceptable, such as sensor streams (camera images, LIDAR scans), live video feeds, or status updates. RELIABLE is used for critical data where every message must be delivered, such as safety commands, configuration changes, or control commands where missing a message could cause issues.

Practical Exercises (2 questions)

Question 9: Design Exercise

Design the ROS 2 node architecture for a mobile robot that navigates a warehouse to pick up objects. Specify:

  1. What nodes you would create and their responsibilities
  2. What topics, services, and actions each node would use
  3. What QoS profiles you would use for each communication channel
  4. How the nodes would coordinate to accomplish the task

Question 10: Troubleshooting Exercise

A ROS 2 system is working but experiencing intermittent communication issues between a camera driver node and the perception processing node. The camera publishes images at 30 Hz but the perception node sometimes misses frames, leading to inconsistent detection performance.

Analyze the potential causes and propose solutions:

  1. What QoS policies might be incorrectly configured?
  2. How could buffer depth settings affect this issue?
  3. What diagnostic tools would you use to verify the communication?
  4. What changes would you recommend to ensure reliable communication for the perception system?

8. Further Reading

  1. "Programming Robots with ROS" - Morgan Quigley, Brian Gerkey, William Smart
     Why read: Practical, example-driven guide. It targets ROS 1, but the node, topic, and service concepts carry over to ROS 2
     Link: https://www.oreilly.com/library/view/programming-robots-with/9781449323899/

  2. ROS 2 Documentation
     Why read: Official documentation with latest updates and detailed tutorials
     Link: https://docs.ros.org/en/humble/

  3. DDS Tutorial by Real-Time Innovations (RTI)
     Why read: Understanding the underlying DDS technology
     Link: https://community.rti.com/rti-docs

  4. "Effective Robotics Programming with ROS" - Anil Mahtani, Luis Sánchez, Enrique Fernández, Aaron Martinez
     Why read: Best practices and design patterns for ROS systems (written for ROS 1)
     Link: https://www.packtpub.com/product/effective-robotics-programming-with-ros-third-edition/9781787281933

  5. ROS 2 Design Papers
     Why read: Understand the theoretical foundations of ROS 2
     Link: https://design.ros2.org/

  6. Fast DDS Documentation
     Why read: Understand the default DDS implementation used by ROS 2
     Link: https://fast-dds.docs.eprosima.com/en/v2.7.0/

  7. ROSCon Proceedings
     Why read: Latest advancements and real-world applications of ROS 2
     Link: https://vimeo.com/roscon

Suggested reading order:

  1. Start with the official ROS 2 documentation for foundational knowledge
  2. Read "Programming Robots with ROS" for practical examples
  3. Explore DDS documentation to understand the middleware
  4. Review ROSCon talks to see real-world applications

9. Hardware/Software Requirements

Software Requirements:

  • Ubuntu 22.04 LTS (recommended)
  • ROS 2 Humble Hawksbill
  • Python 3.10 (the default on Ubuntu 22.04) or a C++ compiler
  • Git for version control
  • Colcon for building packages

Hardware Requirements:

  • Computer with 4+ GB RAM (8+ GB recommended)
  • Multi-core processor (modern Intel/AMD processor)
  • Network access (for tutorials and package installation)
  • (Optional) Any robotic platform for hardware testing

10. Chapter Summary & Next Steps

Chapter Summary

In this chapter, you learned:

  • The evolution of ROS from version 1 to 2 and the motivations
  • Core ROS 2 concepts: nodes, topics, services, actions, and parameters
  • How DDS middleware enables the ROS 2 architecture
  • Quality of Service (QoS) profiles and their applications
  • How to create basic ROS 2 nodes and workspaces
  • The role of ROS 2 in a complete robotic system architecture

Next Steps

In Chapter 1.6, we'll dive deeper into the communication patterns, exploring the publisher-subscriber paradigm in detail, and learning how to implement services and actions. This builds on the architectural foundation you've established here, preparing you to build more complex robotic systems with multiple coordinated components.


Estimated Time to Complete: 2.5 hours
Difficulty Level: Intermediate
Prerequisites: Chapters 1.1-1.4