
Commit c3c4ffa

Merge pull request #27 from relativitydev/REL-1129659-EW-Documentation-Update-Troubleshooting
REL-1129659-EW Documentation Update Troubleshooting
2 parents a16306b + a4425d9 commit c3c4ffa

11 files changed

+1606
-17
lines changed

docs/troubleshooting.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
# Troubleshooting Guide
2+
3+
This document provides quick reference links to detailed troubleshooting guides for all components in the Relativity Server Bundle environment.
4+
5+
## Component Troubleshooting Guides
6+
7+
### [Elasticsearch Troubleshooting →](troubleshooting/elasticsearch.md)
8+
9+
### [Kibana Troubleshooting →](troubleshooting/kibana.md)
10+
11+
### [APM Server Troubleshooting →](troubleshooting/apm-server.md)
12+
13+
### [Environment Watch Monitoring Agent and OpenTelemetry Collector Troubleshooting →](troubleshooting/monitoring-agent-and-otel-collector.md)
14+
15+
### [Relativity Server CLI Troubleshooting →](troubleshooting/relativity-server-cli.md)
16+
17+
### [Relativity Alerts Troubleshooting →](troubleshooting/relativity_alerts_troubleshooting.md)
18+
19+

docs/troubleshooting/apm-server.md

Lines changed: 316 additions & 0 deletions
@@ -0,0 +1,316 @@
1+
# APM Server Troubleshooting
2+
3+
This document provides troubleshooting guidance for common APM Server issues encountered during installation, configuration, and operation in Relativity Server environments.
4+
5+
> [!NOTE]
6+
> This guide assumes a default APM Server installation path of `C:\elastic\apm-server`. Adjust paths according to your actual installation directory.
7+
8+
## Table of Contents
9+
10+
- [1. Windows Service Issues](#1-windows-service-issues)
11+
- [1.1 APM Server Service Not Starting](#11-apm-server-service-not-starting)
12+
- [1.2 Service Crashes or Stops Unexpectedly](#12-service-crashes-or-stops-unexpectedly)
13+
- [1.3 Permission and Access Issues](#13-permission-and-access-issues)
14+
- [2. Port Configuration Issues](#2-port-configuration-issues)
15+
- [2.1 Port Conflicts](#21-port-conflicts)
16+
- [2.2 Network Connectivity Problems](#22-network-connectivity-problems)
17+
- [3. Service Verification](#3-service-verification)
18+
- [3.1 Verifying APM Server Health and Status](#31-verifying-apm-server-health-and-status)
19+
- [4. Self-Instrumentation](#4-self-instrumentation)
20+
21+
---
22+
23+
## 1. Windows Service Issues
24+
25+
### 1.1 APM Server Service Not Starting
26+
27+
**Symptoms:**
28+
- APM Server service fails to start
29+
- Service stops immediately after starting
30+
- Error messages in APM Server logs
31+
32+
**Troubleshooting Steps:**
33+
34+
**Check APM Server Status:**
35+
```powershell
Get-Service -Name apm-server
```
38+
<details>
39+
<summary>Example response (service stopped)</summary>
40+
41+
```
Status   Name               DisplayName
------   ----               -----------
Stopped  apm-server         Elastic APM Server
```
46+
</details>
47+
48+
**Verify Service Configuration:**
49+
```powershell
(Get-CimInstance Win32_Service -Filter "Name = 'apm-server'").StartName
```
52+
<details>
53+
<summary>Expected response</summary>
54+
55+
```
LocalSystem
```
58+
</details>
59+
60+
**Check APM Server Logs:**
61+
1. Navigate to `C:\Program Files\apm-server\logs\`, or to the directory set by `path.logs` if your installation overrides the log location
62+
2. Review the latest log files (`apm-server.log`) for error messages
63+
3. Look for configuration errors or connection issues with Elasticsearch
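
The log review above can be partly scripted. The following is a minimal sketch, not an official tool: the function name `Get-RecentLogErrors`, the `apm-server*` file filter, and the `error`/`fatal` search terms are assumptions chosen for illustration, and the default directory is the log path used in this guide.

```powershell
function Get-RecentLogErrors {
    param(
        # Log directory from this guide; adjust if your installation differs
        [string]$LogDir = 'C:\Program Files\apm-server\logs',
        # Number of lines to read from the end of the newest log file
        [int]$Tail = 200
    )

    # Pick the most recently written log file (the name filter is an assumption)
    $latest = Get-ChildItem -Path $LogDir -Filter 'apm-server*' -File |
        Sort-Object LastWriteTime -Descending |
        Select-Object -First 1
    if (-not $latest) { return }

    # Select-String is case-insensitive by default, so this also matches "ERROR"
    Get-Content -Path $latest.FullName -Tail $Tail |
        Select-String -Pattern 'error', 'fatal' |
        ForEach-Object { $_.Line }
}
```

Running `Get-RecentLogErrors` with no arguments prints matching lines from the newest log file, or nothing if no lines match.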
64+
65+
> [!NOTE]
66+
> For Elasticsearch connection issues, see [Elasticsearch Troubleshooting](elasticsearch.md)
67+
68+
**Verify Configuration File:**
69+
```powershell
# Stop Windows service first, then test configuration syntax
Stop-Service apm-server
C:\elastic\apm-server\apm-server.exe test config -c "C:\elastic\apm-server\apm-server.yml"
```
74+
<details>
75+
<summary>Expected response</summary>
76+
77+
```
Config OK
```
80+
</details>
81+
82+
**Start Service Manually:**
83+
```powershell
Start-Service apm-server
```
86+
87+
---
88+
89+
### 1.2 Service Crashes or Stops Unexpectedly
90+
91+
**Symptoms:**
92+
- APM Server service starts but stops after a short period
93+
- Service status shows "Stopped" unexpectedly
94+
- APM data collection stops working
95+
96+
**Troubleshooting Steps:**
97+
98+
* **Check APM Server Logs:**
99+
See **Check APM Server Logs** in [1.1 APM Server Service Not Starting](#11-apm-server-service-not-starting).
100+
101+
* **Review APM Server Configuration:**
102+
- Check `apm-server.yml` file in `C:\elastic\apm-server\`
103+
- Verify Elasticsearch connection settings (see [Elasticsearch Troubleshooting](elasticsearch.md) for detailed troubleshooting)
104+
- Common configuration issues:
105+
- **TLS**: Ensure correct protocol (`http` vs `https`)
106+
- **Hostname**: Verify correct Elasticsearch server hostname
107+
- **Port**: Confirm correct Elasticsearch port (usually 9200)
108+
109+
> [!NOTE]
110+
> API keys are the preferred authentication method. They expire after 6 months by default, so check for an expired key if a previously working connection starts to fail. Consider switching from username/password to API key authentication. For API key creation, see [Kibana Troubleshooting](kibana.md).
111+
112+
```yaml
output.elasticsearch:
  hosts: ["https://<hostname_or_ip>:9200"]
  # API key in "<id>:<api_key>" form
  api_key: "your-api-key-here"
  # OR (not recommended)
  # username: "<username>"
  # password: "<password>"
```
120+
> [!NOTE]
121+
> This section in `apm-server.yml` configures how APM Server connects to your Elasticsearch cluster.
122+
> - `hosts`: The URL(s) of your Elasticsearch node(s).
123+
> - `api_key`: The recommended authentication method.
124+
> - `username`/`password`: Legacy authentication (not recommended; use API keys instead).
125+
> For instructions on creating an API key, see [Kibana Troubleshooting](kibana.md).
126+
127+
* **To verify the connection, run:**
128+
```powershell
C:\elastic\apm-server\apm-server.exe test output -c "C:\elastic\apm-server\apm-server.yml"
```
131+
<details>
132+
<summary>Expected output for successful connection</summary>
133+
134+
```
elasticsearch: https://<hostname_or_ip>:9200...
  parse url... OK
  connection...
    parse host... OK
    dns lookup... OK
    addresses: fe80::61a7:3f3f:210:8d65%Ethernet 2, 10.0.2.2
    dial up... OK
  TLS...
    security... WARN server's certificate chain verification is disabled
    handshake... OK
    TLS version: TLSv1.3
    dial up... OK
  talk to server... OK
  version: 8.17.3
```
150+
</details>
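
If this check needs to run from a script, the captured output can be evaluated line by line. The sketch below is illustrative only: `Test-ApmOutputResult` is a hypothetical helper, and it assumes that failed steps are reported with the word `ERROR` after the step name, in the same style as the `OK` and `WARN` markers shown in the output above.

```powershell
function Test-ApmOutputResult {
    param(
        # Lines captured from "apm-server.exe test output"
        [string[]]$Lines
    )

    # Success requires the final "talk to server" step to report OK
    $talked = $Lines -match 'talk to server\.\.\.\s+OK'
    # Assumption: any failed step is printed as "<step>... ERROR <details>"
    $errors = $Lines -match '\.\.\.\s+ERROR'

    return ([bool]$talked -and -not [bool]$errors)
}

# Example (paths follow this guide's default install location):
# $out = & 'C:\elastic\apm-server\apm-server.exe' test output -c 'C:\elastic\apm-server\apm-server.yml' 2>&1
# Test-ApmOutputResult -Lines ($out | ForEach-Object { "$_" })
```

A `WARN` line, such as the certificate verification warning above, does not cause the helper to report a failure.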
151+
152+
> [!NOTE]
153+
> To verify Elasticsearch connectivity, see [Elasticsearch Troubleshooting](elasticsearch.md).
154+
155+
---
156+
157+
### 1.3 Permission and Access Issues
158+
159+
**Symptoms:**
160+
- Access denied errors when starting service
161+
- Unable to write to log directories
162+
- Configuration file access errors
163+
164+
**Troubleshooting Steps:**
165+
166+
* The APM Server Windows service runs under the Local System account by default.
167+
* Verify that the service account can read the `C:\elastic\apm-server\` directory, including `apm-server.yml`.
168+
* Check that the service account can write to the `C:\Program Files\apm-server\logs\` directory.
169+
170+
---
171+
172+
## 2. Port Configuration Issues
173+
174+
### 2.1 Port Conflicts
175+
176+
**Symptoms:**
177+
- APM Server fails to bind to default port
178+
- "bind: address already in use" errors in logs
179+
- APM agents cannot connect to server
180+
181+
**Troubleshooting Steps:**
182+
183+
* **Check Default Port:**
184+
- Default APM Server port: 8200
185+
- Verify port availability:
186+
```powershell
netstat -an | findstr ":8200"
```
189+
<details>
190+
<summary>Expected response</summary>
191+
192+
```
(No output if port is available. If you see LISTENING, port is in use.)
```
195+
</details>
196+
197+
* **Identify Port Conflicts:**
198+
```powershell
Get-NetTCPConnection -LocalPort 8200 -State Listen
```
201+
<details>
202+
<summary>Expected response</summary>
203+
204+
```
(No results if port is available. If results are returned, another process is using port 8200.)
```
207+
</details>
208+
209+
> [!IMPORTANT]
210+
> Do not change the APM Server port. Instead, identify and stop the conflicting service using port 8200, as changing the APM Server port requires extensive configuration changes across Environment Watch, Relativity, and other components.
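
One way to find the process that is holding the port is to resolve the listener's process ID. This is a minimal sketch for Windows; the function name `Get-PortOwner` is hypothetical, while `Get-NetTCPConnection` and `Get-Process` are standard cmdlets.

```powershell
function Get-PortOwner {
    param(
        # Port to inspect; 8200 is the APM Server default named in this guide
        [int]$Port = 8200
    )

    # OwningProcess holds the PID of the process listening on the port
    Get-NetTCPConnection -LocalPort $Port -State Listen -ErrorAction SilentlyContinue |
        Select-Object -ExpandProperty OwningProcess -Unique |
        ForEach-Object { Get-Process -Id $_ } |
        Select-Object Id, ProcessName, Path
}
```

Calling `Get-PortOwner -Port 8200` returns nothing when the port is free. If the listed process is `apm-server`, the port is held by APM Server itself and there is no conflict.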
211+
212+
---
213+
214+
### 2.2 Network Connectivity Problems
215+
216+
**Possible Causes:**
217+
- Service Not Running: APM Server or Elasticsearch may not be running or listening on the expected endpoints.
218+
- Incorrect Configuration: The APM Server or Elasticsearch endpoint URLs may be misconfigured (wrong host, port, or protocol).
219+
- Firewall Rules: Firewalls on the VM host or network may be blocking required ports (e.g., 9200 for Elasticsearch, 8200 for APM Server).
220+
221+
**Troubleshooting Steps:**
222+
223+
* **Verify Network Binding:**
224+
- Check `apm-server.yml` configuration:
225+
```yaml
apm-server:
  host: "0.0.0.0:8200" # Listen on all interfaces
  # or bind to a specific interface instead:
  # host: "<hostname_or_ip>:8200"
```
231+
232+
* **Test Remote Connectivity:**
233+
```powershell
Test-NetConnection -ComputerName <hostname_or_ip> -Port 8200
```
236+
<details>
237+
<summary>Expected response</summary>
238+
239+
```
TcpTestSucceeded : True
```
242+
</details>
243+
244+
---
245+
246+
## 3. Service Verification
247+
248+
### 3.1 Verifying APM Server Health and Status
249+
250+
**Symptoms:**
251+
- Need to confirm APM Server is operating correctly
252+
- Performance monitoring requirements
253+
- Health check automation
254+
255+
**Troubleshooting Steps:**
256+
257+
* **Verify Server Configuration:**
258+
```powershell
C:\elastic\apm-server\apm-server.exe test config -c "C:\elastic\apm-server\apm-server.yml"
```
261+
<details>
262+
<summary>Expected response</summary>
263+
264+
```
Config OK
```
267+
</details>
268+
269+
* **Check Elasticsearch Connection:**
270+
```powershell
# Stop Windows service first, then test output connectivity
Stop-Service apm-server
C:\elastic\apm-server\apm-server.exe test output -c "C:\elastic\apm-server\apm-server.yml"
```
275+
<details>
276+
<summary>Expected output for successful connection</summary>
277+
278+
```
elasticsearch: http://<hostname_or_ip>:9200...
  parse url... OK
  connection...
    parse host... OK
    dns lookup... OK
    addresses: 192.168.1.100
    dial up... OK
  TLS... WARN secure connection disabled
  talk to server... OK
```
289+
</details>
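
For health check automation, the APM Server listener can also be probed over HTTP. The sketch below assumes the server answers on its root path at the default port 8200 over plain HTTP; the function name `Test-ApmServerEndpoint` and the default URI are illustrative, so adjust the scheme and host to match your `apm-server.yml`.

```powershell
function Test-ApmServerEndpoint {
    param(
        # Assumed default: local APM Server on port 8200 without TLS
        [string]$Uri = 'http://localhost:8200/',
        [int]$TimeoutSec = 5
    )

    try {
        $response = Invoke-WebRequest -Uri $Uri -UseBasicParsing -TimeoutSec $TimeoutSec
        return ($response.StatusCode -eq 200)
    }
    catch {
        # Connection refused, timeout, TLS failure, or an error status all end up here
        return $false
    }
}
```

`Test-ApmServerEndpoint` returns `$true` only when the endpoint responds with HTTP 200, which makes it usable as a simple pass/fail probe in a scheduled task.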
290+
291+
---
292+
293+
## 4. Self-Instrumentation
294+
295+
**Symptoms:**
296+
- Need to monitor APM Server itself for performance and health metrics
297+
- Want to enable self-monitoring and observability for the APM Server
298+
299+
**Troubleshooting Steps:**
300+
301+
> [!NOTE]
302+
> For detailed self-instrumentation configuration steps, see the [Development Tier Setup Guide](../elasticsearch_setup_development.md#step-3-install-and-configure-apm-server).
303+
304+
* **Enable Self-Instrumentation:**
305+
- Self-instrumentation allows APM Server to monitor its own performance
306+
- This feature is configured in the `apm-server.yml` file
307+
- Refer to Step 3 in the [elasticsearch_setup_development.md](../elasticsearch_setup_development.md) guide for complete configuration details
308+
309+
* **Verify Self-Instrumentation:**
310+
- After configuration, restart the APM Server service:
311+
```powershell
Restart-Service apm-server
```
314+
- Check Kibana to verify that APM Server metrics are being collected
315+
316+
