Common Issues
Solutions for frequently encountered problems.
Startup Issues
Service Fails to Start
Symptom: Application exits immediately after startup.
Possible Causes:
Nacos not reachable
Error: Failed to connect to Nacos server
Solution: Ensure Nacos is running on port 8848.
```bash
curl http://localhost:8848/nacos/v1/console/health/liveness
```
Database connection failed
Error: Communications link failure
Solution: Verify MySQL is running and credentials are correct.
```bash
mysql -h <mysql-host> -u <mysql-user> -p RecordPlatform
```
Port already in use
Error: Address already in use: 8000
Solution: Kill the existing process or use a different port.
```bash
lsof -i :8000
kill -9 <PID>
```
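If the service is launched from a script, it can help to wait for Nacos to respond before starting the application, so a slow dependency does not look like a startup failure. A minimal sketch, assuming a POSIX shell and curl; the `retry` helper and the attempt and delay values are illustrative and not part of the project:

```bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS is used up.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Example: poll the Nacos liveness endpoint up to 10 times, 2 seconds apart.
# retry 10 2 curl -fsS http://localhost:8848/nacos/v1/console/health/liveness
```

The helper returns non-zero once all attempts fail, so a launch script can stop with a clear message instead of letting the application exit on its own.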
Dubbo Service Registration Failed
Symptom: Provider services not visible in Nacos console.
Solution:
- Check Nacos address in configuration
- Verify network connectivity to Nacos
- Check Dubbo protocol port is not blocked
Authentication Issues
401 Unauthorized
Symptom: All API requests return 401.
Possible Causes:
JWT expired
- Tokens expire after the configured TTL
- Solution: Re-authenticate to get a new token
Invalid JWT_KEY
- If JWT_KEY changed after token issuance, previously issued tokens no longer validate
- Solution: Clear client tokens and re-login
Missing Authorization header
- Request lacks Authorization: Bearer <token>
- Solution: Include the header in all authenticated requests
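To tell an expired token apart from a key mismatch, the token's expiry can be read locally without contacting the server. The sketch below assumes the token is a standard three-segment JWT whose payload carries a numeric `exp` claim (seconds since the epoch); whether this platform's tokens include that claim is an assumption, and the function names are hypothetical. It does not verify the signature.

```bash
# jwt_exp TOKEN: print the numeric "exp" claim from the token payload, if any.
jwt_exp() {
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so base64 can decode.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d 2>/dev/null |
    sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# jwt_is_expired TOKEN [NOW]: succeed if "exp" is at or before NOW (epoch seconds).
# Fails when no "exp" claim can be read, so handle that case separately.
jwt_is_expired() {
  exp=$(jwt_exp "$1")
  now=${2:-$(date +%s)}
  [ -n "$exp" ] && [ "$now" -ge "$exp" ]
}

# Example:
# jwt_is_expired "$TOKEN" && echo "token expired, re-authenticate"
```

If the token is not expired and the header is present, a changed JWT_KEY becomes the more likely cause.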
403 Forbidden
Symptom: User authenticated but access denied.
Possible Causes:
Insufficient permissions
- User lacks required role/permission
- Solution: Check user's role assignments
Resource ownership
- Trying to access another user's resources
- Solution: Verify resource ownership or admin privileges
Storage Issues
File Upload Fails
Symptom: Upload returns error or times out.
Possible Causes:
S3 node offline
```bash
# Check node status
curl http://localhost:8092/actuator/health
```
Solution: Restart the offline node or wait for failover.
Insufficient replicas
- Healthy nodes cannot satisfy effective quorum or degraded-write minimum replicas
- Solution: Ensure online nodes satisfy storage.replication.quorum and are not below storage.degraded-write.min-replicas.
Bucket does not exist
Error: The specified bucket does not exist
Solution: Create the bucket or check bucket name configuration.
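The node health check and the replica thresholds can be combined into one quick test. This is a rough sketch: it assumes each node exposes a Spring Boot style health document with a top-level `"status"` field, and it treats the quorum and minimum-replica values as plain numbers passed in by hand. The function names and `NODE_URLS` are hypothetical, and the service's own effective-quorum calculation may differ.

```bash
# health_is_up: read a health JSON document on stdin and succeed if the
# first key is "status" with value "UP" (the usual top-level layout).
health_is_up() {
  status=$(tr -d '\n' |
    sed -n 's/^[^"]*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p')
  [ "$status" = "UP" ]
}

# replicas_ok ONLINE QUORUM MIN_REPLICAS: succeed if the number of online
# nodes meets both thresholds.
replicas_ok() {
  [ "$1" -ge "$2" ] && [ "$1" -ge "$3" ]
}

# Example (NODE_URLS is a space-separated list you supply):
# online=0
# for url in $NODE_URLS; do
#   curl -fsS "$url/actuator/health" | health_is_up && online=$((online + 1))
# done
# replicas_ok "$online" 2 1 || echo "not enough healthy nodes for upload"
```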
File Download Fails
Symptom: Download returns 404 or corrupted data.
Possible Causes:
File not fully replicated
- Saga transaction incomplete
- Solution: Check saga status in database.
Shard corruption
- One or more shards are corrupted
- Solution: Trigger consistency repair via admin endpoint.
Node containing shard is offline
- Solution: Wait for node recovery or trigger rebalance.
Saga Transaction Stuck
Symptom: Upload shows as pending indefinitely.
Solution:
- Check file_saga table for status
- Review saga step failures in file_saga_step
- Trigger manual compensation if needed:
```sql
UPDATE file_saga SET status = 'COMPENSATING' WHERE saga_id = '<id>' AND status = 'RUNNING';
```
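Before changing a saga's status by hand, it is worth reading its current row. The helper below is a sketch that only builds a read-only query from the `saga_id` and `status` columns shown above. The function name is hypothetical, and the restriction of ids to letters, digits, `-` and `_` is an assumption made to keep quoting safe, not a documented format.

```bash
# saga_status_sql SAGA_ID: print a SELECT statement for one saga.
saga_status_sql() {
  # Reject empty ids and ids with characters outside the assumed format.
  case "$1" in
    ''|*[!A-Za-z0-9_-]*)
      echo "invalid saga id" >&2
      return 1
      ;;
  esac
  printf "SELECT saga_id, status FROM file_saga WHERE saga_id = '%s';\n" "$1"
}

# Example:
# saga_status_sql "$SAGA_ID" | mysql -h <mysql-host> -u <mysql-user> -p RecordPlatform
```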
Blockchain Issues
Connection Timeout
Symptom: Blockchain operations time out after 30s.
Possible Causes:
FISCO node unreachable
```bash
# Test connectivity
telnet 127.0.0.1 20200
```
Solution: Verify node is running and network allows connection.
Certificate mismatch
- SDK certificates don't match node certificates
- Solution: Regenerate and deploy matching certificates.
Circuit breaker open
- Too many failures triggered circuit breaker
- Solution: Wait for half-open state or restart service.
Contract Call Failed
Symptom: Smart contract operations return errors.
Possible Causes:
Contract not deployed
Error: Contract address is null
Solution: Deploy contracts and update addresses in config.
Insufficient gas
- Transaction exceeds gas limit
- Solution: Increase gas limit in configuration.
Permission denied
- Caller not authorized for the contract
- Solution: Check contract ACL configuration.
Performance Issues
Slow API Response
Symptom: API requests take >5 seconds.
Diagnostic Steps:
Check database queries
```sql
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
```
Check connection pools
- Druid monitor: /record-platform/druid/
- Look for connection wait times
Check S3 node latency
- Review s3_node_load_score metrics
- Consider adding more nodes
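To confirm the symptom before digging into the database or storage nodes, the end-to-end request time can be measured from the command line. A small sketch, assuming curl and awk are available; the function names are hypothetical and the 5 second default mirrors the symptom described above:

```bash
# request_seconds URL: print the total request time in seconds, as reported by curl.
request_seconds() {
  curl -o /dev/null -s -w '%{time_total}\n' "$1"
}

# exceeds_threshold SECONDS [LIMIT]: succeed if SECONDS is greater than LIMIT (default 5).
exceeds_threshold() {
  awk -v t="$1" -v limit="${2:-5}" 'BEGIN { exit !(t > limit) }'
}

# Example (API_URL is the endpoint you want to time):
# t=$(request_seconds "$API_URL")
# exceeds_threshold "$t" && echo "slow response: ${t}s"
```

Timing the same endpoint several times helps separate a consistently slow query from occasional pool or node contention.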
High Memory Usage
Symptom: JVM heap usage >80%.
Solutions:
Increase heap size
```bash
JAVA_OPTS="-Xms4g -Xmx8g"
```
Check for memory leaks
- Enable heap dumps on OOM
- Analyze with Eclipse MAT or VisualVM
Reduce concurrent uploads
- Limit multipart.max-file-size
- Reduce chunk buffer sizes
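Heap dumps on OOM are enabled with standard HotSpot flags. The sketch below extends the heap settings shown above; it assumes the launch script reads `JAVA_OPTS`, and the dump directory is a placeholder that must exist and have room for a full heap image.

```bash
# HEAP_DUMP_DIR is a placeholder; override it for your deployment.
HEAP_DUMP_DIR=${HEAP_DUMP_DIR:-/tmp/heapdumps}

# Same heap sizes as above, plus a heap dump written when an OutOfMemoryError occurs.
JAVA_OPTS="-Xms4g -Xmx8g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=$HEAP_DUMP_DIR"
export JAVA_OPTS
```

The resulting `.hprof` file can then be opened in Eclipse MAT or VisualVM.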
Database Connection Pool Exhausted
Symptom: "Cannot acquire connection from pool" errors.
Solutions:
Increase pool size
```yaml
spring:
  datasource:
    druid:
      max-active: 100
```
Fix connection leaks
- Check for unclosed connections in code
- Enable connection leak detection in Druid
Optimize slow queries
- Add missing indexes
- Review query execution plans
Redis Issues
Cache Miss Rate High
Symptom: Requests for data that should be cached still hit the database.
Solutions:
Check Redis connectivity
```bash
redis-cli -h <redis-host> ping
```
Verify TTL settings
- Ensure cache isn't expiring too quickly
Check memory eviction
- Redis may be evicting keys due to memory pressure
- Increase maxmemory or reduce cached data
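The miss rate itself can be estimated from the `keyspace_hits` and `keyspace_misses` counters in the Redis `INFO stats` output. A minimal sketch, assuming awk is available; the function name is hypothetical, and the counters are server-wide totals since the last restart or stats reset, not per-application figures:

```bash
# redis_miss_rate: read `redis-cli info stats` output on stdin and print the
# miss rate as a percentage, or "n/a" when there have been no lookups.
redis_miss_rate() {
  # INFO lines end in CRLF; strip the carriage returns before parsing.
  tr -d '\r' | awk -F: '
    $1 == "keyspace_hits"   { hits = $2 }
    $1 == "keyspace_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) { print "n/a"; exit }
      printf "%.1f\n", 100 * misses / total
    }'
}

# Example:
# redis-cli -h <redis-host> info stats | redis_miss_rate
```

The same `INFO stats` section reports `evicted_keys`, which indicates whether memory pressure is removing entries.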
Redis Connection Timeout
Symptom: Intermittent Redis timeouts.
Solutions:
Check network latency
```bash
redis-cli -h <redis-host> --latency
```
Increase connection timeout
```yaml
spring:
  redis:
    timeout: 5000
```
Use connection pooling
- Configure Lettuce pool settings
RabbitMQ Issues
Messages Not Being Consumed
Symptom: Queue depth keeps growing.
Solutions:
Check consumer is running
```bash
rabbitmqctl list_consumers
```
Check for consumer errors
- Review application logs for listener exceptions
Check queue bindings
```bash
rabbitmqctl list_bindings
```
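The checks above can be narrowed down by listing queues that hold messages but have no consumers attached. A sketch that filters the tab-separated output of `rabbitmqctl list_queues name messages consumers`; the function name is hypothetical, and header or informational lines are skipped because their second and third columns are not numeric:

```bash
# unconsumed_queues: read list_queues output (name, messages, consumers) on
# stdin and print the names of queues with messages > 0 and consumers = 0.
unconsumed_queues() {
  awk -F '\t' '$2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && $2 > 0 && $3 == 0 { print $1 }'
}

# Example:
# rabbitmqctl -q list_queues name messages consumers | unconsumed_queues
```

A queue that appears here points to a stopped or misconfigured listener; a growing queue that does not appear has consumers that are running but too slow or failing.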
Dead Letter Queue Growing
Symptom: DLQ has many messages.
Solutions:
Review message processing errors
- Check application logs for root cause
Reprocess DLQ messages
- Move messages back to main queue after fixing issue
Adjust retry settings
```yaml
spring:
  rabbitmq:
    listener:
      simple:
        retry:
          max-attempts: 5
```
Quota Issues
Upload Fails with QUOTA_EXCEEDED
Symptom: Upload rejected with error code 50013 (QUOTA_EXCEEDED).
Cause: The tenant/user has exceeded their storage quota and enforcement mode is ENFORCE.
Solution:
- Check current quota status via GET /api/v1/files/quota
- If in SHADOW mode, uploads are allowed but logged; verify quota.enforcement-mode in config
- To increase quota limits, contact the admin or adjust the quota_policy table
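The quota endpoint can be queried from the command line with the same bearer token the client uses. A minimal sketch, assuming curl; `API_BASE`, `TOKEN`, and the function name are assumptions for illustration, and the base URL may need a context path in your deployment. The response format is not described here, so the body is printed as returned.

```bash
# API_BASE and TOKEN are assumptions for this sketch; set them for your deployment.
API_BASE=${API_BASE:-http://localhost:8000}

# quota_status: request the caller's quota status and print the raw response body.
# With -f, curl exits non-zero on HTTP errors such as 401 or 403.
quota_status() {
  curl -fsS -H "Authorization: Bearer $TOKEN" "$API_BASE/api/v1/files/quota"
}

# Example:
# TOKEN=<token> quota_status
```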