Basics of Windows Kernel Debugging
In this blog, we explain the basics of Windows kernel debugging for troubleshooting the Windows Blue Screen of Death (BSOD).
In the IT world, whether you are a developer writing and debugging your own device drivers or debugging someone else's code, kernel debugging is crucial to your job. You need to understand how a device driver interacts with the operating system and fix any unhandled conditions.
If you are an IT professional managing Windows systems, you need to ensure server uptime and to identify and fix problems. Kernel debugging skills will help you quickly determine why a system is crashing or hanging and find the appropriate hotfix to address the issue.
Crash dump analysis is defined as “the art of interpreting the contents of RAM to determine why a system failed”. We need to find the clues in the memory dump and work out who did it.
DIFFERENCE BETWEEN SYSTEM CRASHES AND SYSTEM HANGS
- System crashes are unexpected, unhandled exceptions that occur in kernel mode, the most privileged OS mode.
- The state of the operating system becomes questionable.
- The operating system stops to avoid corrupting data.
- A crash is also known as a Blue Screen of Death (BSOD) or a bug check.
- When a crash occurs, the contents of RAM (all or part, depending on the configured dump type) are automatically saved to a file called Memory.dmp.
- A system hang occurs when processors or peripheral devices become unresponsive and the system appears to lock up and freeze.
- Most system hangs are caused by faulty software or depleted memory resources. A buggy application may have a memory leak or a deadlock that causes a cascading effect and eventually hangs the entire system.
- System hangs can also be caused by faulty hardware, such as a bad processor or defective RAM.
- On a hung system, the memory dump must be forced manually.
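Once a Memory.dmp file exists, it can be handy to confirm that it really is a kernel dump before copying a multi-gigabyte file to an analysis machine. The following is a minimal sketch, not an official tool. It assumes the commonly documented header layout in which a kernel-mode dump starts with the ASCII bytes "PAGEDUMP" (32-bit) or "PAGEDU64" (64-bit), while a user-mode minidump starts with "MDMP". The function name identify_dump is our own.

```python
# Known 8-byte signatures at the start of a kernel-mode dump file
# (assumption: "PAGE" followed by a validity tag, as commonly documented).
KERNEL_DUMP_TAGS = {
    b"PAGEDUMP": "32-bit kernel dump",
    b"PAGEDU64": "64-bit kernel dump",
}


def identify_dump(path):
    """Return a short description of the dump, or None if the header is not recognized."""
    with open(path, "rb") as f:
        magic = f.read(8)
    # User-mode minidumps use a different 4-byte signature.
    if magic[:4] == b"MDMP":
        return "user-mode minidump"
    return KERNEL_DUMP_TAGS.get(magic)
```

For example, identify_dump(r"C:\Windows\MEMORY.DMP") would be expected to report a 64-bit kernel dump on a 64-bit system. The check only reads the first eight bytes, so it says nothing about whether the rest of the file is complete.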
COMMON CAUSES OF SYSTEM CRASHES
- The biggest cause of system crashes and hangs is outdated anti-virus software. It happens when the OS is upgraded but the anti-virus drivers are not. Data structures and functions change between operating system versions, so old device drivers that attempt to access this data leave the operating system in an inconsistent state.
- New device drivers can also cause the OS to crash or hang. This happens when drivers for an older technology are replaced with new ones. New drivers are more prone to bugs than mature drivers.
- Incompatible drivers are another top cause of crashes in Windows. It's common for one driver to depend on another. When you update a driver, you may not be aware of its dependencies.
- Too many filter drivers can also cause a system crash. Filter drivers are used to perform additional functions and checks, for example antivirus or disk mirroring drivers. Each driver uses part of the limited stack space in the kernel. When the stack space runs out, the system crashes with a stack overflow error.
- The toughest type of crash to diagnose is memory corruption, because the corruption is often detected only when a memory pool block is allocated or freed, long after the offending application has exited.
- Occasional hardware failures and operating system bugs can also cause system crashes and hangs.
COMMON CAUSES OF SYSTEM HANGS
- System hangs are typically caused by depleted system resources such as the memory pool.
- Another cause is a buggy application that triggers a deadlock or a spinlock hang.
- High-priority applications can also make other applications unresponsive.
- Old antivirus software, new drivers, incompatible drivers and broken hardware can also cause system hangs.
MEMORY DUMP OVERVIEW
- When the system crashes (or a dump is forced on a hung system), the contents of RAM are written to pagefile.sys, which is normally used for paging activity.
- After the system reboots, the dump data is copied from the page file to the Memory.dmp file.
- At this point, we can use the Windows debugger, also known as WinDbg, to analyze the crash dump.
- Finally, the debugger connects over the internet to the Microsoft symbol server to download the symbols it needs to translate the memory dump.
Note: In Windows Server 2008/Windows Vista and later, a dedicated dump file on any local disk can be used instead of the page file.
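The last two steps above can be sketched as a small helper that builds the command line for kd.exe, the console debugger that ships alongside WinDbg. This is only an illustration under stated assumptions: the local symbol cache folder C:\Symbols and the dump location C:\Windows\MEMORY.DMP are example defaults, and the helper names symbol_path and kd_command are our own. The -z (dump file), -y (symbol path) and -c (initial command) options and the !analyze -v command are standard debugger features.

```python
# Public Microsoft symbol server.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir=r"C:\Symbols"):
    """Build a symbol path that caches downloaded symbols in cache_dir (example folder)."""
    return f"srv*{cache_dir}*{MS_SYMBOL_SERVER}"


def kd_command(dump_file=r"C:\Windows\MEMORY.DMP", cache_dir=r"C:\Symbols"):
    """Return an argument list that opens the dump, runs !analyze -v and quits."""
    return [
        "kd.exe",
        "-z", dump_file,                # crash dump to open
        "-y", symbol_path(cache_dir),   # where to find and cache symbols
        "-c", "!analyze -v; q",         # automatic analysis, then exit
    ]
```

The resulting list can be passed to subprocess.run on a machine where the Debugging Tools for Windows are installed and kd.exe is on the PATH. The same symbol path string also works in WinDbg's symbol path dialog.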
TYPES OF MEMORY DUMPS
There are five types of memory dumps available in the Windows environment: small memory dump, kernel memory dump, complete memory dump, automatic memory dump and active memory dump.
- Small memory dump – A small (64 KB) mini-dump containing minimal debugging information such as the stop code, its parameters, the stack and the list of loaded drivers. It's primarily used to keep a history of crash footprints, to determine whether the same crash is recurring or a different one is occurring.
- Kernel memory dump – The kernel memory dump, also called the kernel summary dump, is more useful. It provides the vast majority of the kernel debugging information needed to solve a crash. The size of the dump ranges from a few hundred megabytes to several gigabytes, depending on memory usage. It's the recommended dump file setting.
- Complete memory dump – As the name implies, it contains the complete contents of physical memory. Writing it takes a considerable amount of disk space and time.
- Automatic memory dump – Introduced in Windows Server 2012 and Windows 8. It contains the same information as a kernel memory dump, but allows a smaller page file when a system-managed page file is used. This setting is useful for machines with SSDs and large amounts of RAM.
- Active memory dump – Introduced in Windows Server 2016 and Windows 10. It's similar to a complete memory dump, but it does not include pages such as the file cache and guest VM pages.
- For more details, see https://support.microsoft.com/en-in/help/254649/overview-of-memory-dump-file-options-for-windows and https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/active-memory-dump
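The configured dump type is stored in the registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl. The sketch below maps the CrashDumpEnabled and FilterPages values to the five dump types listed above. It is an illustration rather than a complete tool: the value meanings follow Microsoft's documentation as we understand it, the helper names are our own, and read_crash_control only works when run on Windows.

```python
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Meaning of the CrashDumpEnabled registry value.
DUMP_TYPES = {
    0: "None",
    1: "Complete memory dump",
    2: "Kernel memory dump",
    3: "Small memory dump",
    7: "Automatic memory dump",
}


def dump_type(crash_dump_enabled, filter_pages=0):
    """Translate the registry values into a dump type name."""
    # An active memory dump is a complete dump with page filtering turned on.
    if crash_dump_enabled == 1 and filter_pages == 1:
        return "Active memory dump"
    return DUMP_TYPES.get(crash_dump_enabled, "Unknown")


def read_crash_control():
    """Read the current setting from the registry (Windows only)."""
    import winreg  # imported here so the module still loads on other platforms

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        enabled, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        try:
            filter_pages, _ = winreg.QueryValueEx(key, "FilterPages")
        except FileNotFoundError:
            filter_pages = 0  # value is absent unless active dumps are configured
    return dump_type(enabled, filter_pages)
```

Keeping the mapping in a plain function, separate from the registry access, makes it easy to reuse when the values come from somewhere else, such as an exported configuration report.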
DEDICATED MEMORY DUMP
- In Windows Server 2008 and later, we can configure a dedicated dump file rather than relying on the page file to capture the system memory dump.
- Prior to Windows Server 2008, the main restriction on capturing a memory dump was that the page file had to exist on the system disk and be large enough to hold the dump.
- Later versions eliminate this reliance on the page file.
- This feature provides greater flexibility in managing crash dump files and page files. We can distribute the page file evenly across multiple local disks to achieve the best performance.
- The dedicated dump file can be located on any local disk. It should not be located on a SAN-based drive.
- This feature is especially useful for systems with very large amounts of memory.
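A dedicated dump file is configured through two registry values under the CrashControl key: DedicatedDumpFile (the full path, a string) and DumpFileSize (the size in megabytes, a DWORD). As a rough sketch, the helper below generates the text of a .reg file for these two values. The function name, the example path D:\dedicateddumpfile.sys and the 16 GB size are assumptions chosen for illustration, so review the output before importing it, and note that a reboot is needed for the change to take effect.

```python
def dedicated_dump_reg(path, size_mb):
    """Return .reg file text that sets a dedicated dump file of size_mb megabytes."""
    if size_mb < 0:
        raise ValueError("size_mb must not be negative")
    # Backslashes must be doubled inside .reg string values.
    escaped = path.replace("\\", "\\\\")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl]",
        f'"DedicatedDumpFile"="{escaped}"',
        f'"DumpFileSize"=dword:{size_mb:08x}',  # DWORD values are written in hex
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example: a 16 GB dedicated dump file on a local D: drive (hypothetical values).
    print(dedicated_dump_reg(r"D:\dedicateddumpfile.sys", 16384))
```

Generating the file instead of editing the registry directly keeps the change reviewable and easy to apply to several servers.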
Thanks for reading this blog. We hope it helped you learn the basics of Windows kernel debugging.