Linux coredump debugging, part 2: examples

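The previous article gave only a simple demonstration; real programs run into all sorts of problems. This article therefore collects some real cases from my own programming experience.
Most of the crashes I have run into were segmentation faults caused by illegal memory use or out-of-bounds access. Avoiding them comes down to code quality: check that a pointer is valid before using it, set a pointer to NULL right after freeing it, keep array indexes within range, and so on.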

Invalid pointers

The following example is a fragment of an onvif program I worked on a while ago. The debugging session went as follows:

GNU gdb (Ubuntu 7.7-0ubuntu3.1) 7.7
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from a.out...done.
[New LWP 22255]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Core was generated by `./a.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x083e95b2 in WsddProxyImpl::discoverDevices (this=0xa617628, ip=0x0) at src/onvifwsddProxyImpl.cpp:131
131 matchDevice.Scopes = resp.wsdd__ProbeMatches->ProbeMatch->Scopes->__item;
(gdb) bt
#0 0x083e95b2 in WsddProxyImpl::discoverDevices (this=0xa617628, ip=0x0) at src/onvifwsddProxyImpl.cpp:131
#1 0x083b3d41 in OnvifClient::test (this=0xa5eb008, ip=0x0) at src/onvifClient.cpp:65
#2 0x083b3a0f in main_cpp (argc=1, argv=0xbfcc7ec4) at main.cpp:27
#3 0x083b39c8 in main (argc=1, argv=0xbfcc7ec4) at main.cpp:17

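The gdb output shows that the crash is at resp.wsdd__ProbeMatches->ProbeMatch->Scopes->__item. With this many levels of pointers, any one of the intermediate pointers may be invalid, so the code should check them level by level.
Actual cause: the pointer resp.wsdd__ProbeMatches->ProbeMatch->Scopes was null.
Note: at work I once ran into an RTSP module that used multi-level pointers without any checks. The program I had taken over was large, I was unfamiliar with its architecture, and the crash only appeared in one specific configuration, so tracking it down was painful. Fortunately, the coredump still pinpointed the problem.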

Using vector

The next example is also from the onvif program; the crash was caused by using vector incorrectly.

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x083deb46 in DeviceBindingProxyImpl::getServices (this=0xa383ff8, services=...) at src/onvifDeviceBindingProxyImpl.cpp:68
68 services_[i].VersionMajor = resp.Service[i]->Version->Major;
(gdb) bt
#0 0x083deb46 in DeviceBindingProxyImpl::getServices (this=0xa383ff8, services=...) at src/onvifDeviceBindingProxyImpl.cpp:68
#1 0x083af56a in OnvifClient::test (this=0xa33a008, ip=0xbf88d91e "172.18.45.16") at onvifClient.cpp:101
#2 0x083af024 in main_cpp (argc=2, argv=0xbf88bdd4) at main.cpp:24
#3 0x083aefc4 in main (argc=2, argv=0xbf88bdd4) at main.cpp:10

Code snippet:

for (unsigned int i = 0; i < resp.Service.size(); i++)
{
    odt__Service tmp;
    tmp.Namespace = resp.Service[i]->Namespace; // store in the temporary first
    tmp.XAddr = resp.Service[i]->XAddr; // store in the temporary first
    if (resp.Service[i]->Version)
    {
        // The error occurs here: services_ has no element i yet,
        // so indexing it like this is not allowed.
        services_[i].VersionMajor = resp.Service[i]->Version->Major;
        services_[i].VersionMinor = resp.Service[i]->Version->Minor;
    }
    services_.push_back(tmp); // push_back: the vector grows automatically
}

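Note: when a std::vector is empty, you cannot assign through [i].XXX directly. operator[] does no bounds checking and does not create elements, so indexing past size() is undefined behavior. You can call resize() beforehand to set the size, but using push_back is the better option.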
This article may be updated from time to time.
Li Chi (李迟), 2015.5.31, Tuesday evening