Many improvements to die interpretations given our explorations into high_pc #89

Merged
merged 4 commits into from
Apr 22, 2025
20 changes: 20 additions & 0 deletions README.md
@@ -255,3 +255,23 @@ The following flags are not currently in use or will undergo heavy changes as th
- `[orc_test_flags]`: A series of runtime settings to pass to the test app for this test.

- `[orc_flags]`: A series of runtime settings to pass to the ORC engine for this test.

# Appendix A: Destructor Implementations

The destructor of a given class has been observed to have different sizes across translation units. One contributing factor is that the Itanium ABI defines [several destructor types](https://itanium-cxx-abi.github.io/cxx-abi/abi.html#vague-ctor), which are easy to mistake for one another. [Mark Rowe](https://www.linkedin.com/in/bdash/) provides an excellent synopsis:

> Presumably since you're seeing these destructors in multiple object files, they are declared in the header. The definitions have what is referred to as "vague linkage". For Mach-O this typically means they're emitted as weak definitions.
>
> At link time, the linker will use a strong definition for the symbol if it exists, otherwise it'll pick one of the weak definitions to use. If the symbol is not exported from the binary it'll be converted to a strong definition, otherwise it'll remain weak and the dynamic loader will do the same resolution process at load time (strong definition if one exists, otherwise pick one of the available weak definitions).
>
> If all of the various definitions are equivalent from an ABI point of view, it should not matter if they compile to slightly different code. However, if they are not ABI compatible, you'll have bugs that can be very hard to track down.
>
> `-fomit-frame-pointer` being used in some translation units and not others will result in different code being generated for the same function. Different optimization levels will as well. Those should still be ABI-compatible though.
>
> Things like class members or virtual functions that are conditionally included or can change types based on `#if`s are a common source of problems.
>
> The other thing to be aware of is that for classes with virtual member functions, the compiler will often generate two destructors: the regular destructor, and the so-called "deleting" destructor. The deleting destructor is effectively a call to the regular destructor followed by a call to the appropriate `operator delete` implementation. If you're not distinguishing between these two types of destructors that may lead you to believe they're different sizes.
>
> The different destructor types are described in the Itanium ABI and can be distinguished via their mangled names.

(This appendix should be kept around until there is reasonable confidence that ORC is discerning between the various types and minimizing false positives.)
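
As a rough illustration of the last point in the quote, the sketch below classifies a mangled name by its destructor kind. It relies on the Itanium encodings `D0` (deleting), `D1` (complete object), and `D2` (base object), and on the fact that a destructor takes no parameters, so its mangled name ends in `Ev`. The names `dtor_kind` and `classify_destructor` are hypothetical and are not part of ORC. The suffix check is only a heuristic; a real tool should parse the name with a demangler.

```cpp
#include <string_view>

// Destructor variants defined by the Itanium C++ ABI.
// (Hypothetical enum for illustration; not an ORC type.)
enum class dtor_kind { not_a_destructor, deleting, complete_object, base_object };

// Heuristic: destructors take no parameters, so their mangled names end in
// "D0Ev", "D1Ev", or "D2Ev". An ordinary function whose own name happens to
// end in "D1" would be misclassified, so treat the result as a hint only.
inline dtor_kind classify_destructor(std::string_view mangled) {
    if (mangled.size() < 4) return dtor_kind::not_a_destructor;
    const std::string_view suffix = mangled.substr(mangled.size() - 4);
    if (suffix == "D0Ev") return dtor_kind::deleting;
    if (suffix == "D1Ev") return dtor_kind::complete_object;
    if (suffix == "D2Ev") return dtor_kind::base_object;
    return dtor_kind::not_a_destructor;
}
```

For example, `_ZN3FooD0Ev` and `_ZN3FooD1Ev` both demangle to `Foo::~Foo()`, yet the first also calls `operator delete`, so comparing their sizes directly would be misleading.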
38 changes: 38 additions & 0 deletions include/orc/dwarf_constants.hpp
@@ -106,6 +106,8 @@ enum class at : std::uint16_t {
abstract_origin = 0x31,
accessibility = 0x32,
address_class = 0x33,
// DW_AT_artificial attribute indicates that the associated entity (e.g., a function, variable, or parameter)
// is compiler-generated rather than explicitly written by the programmer in the source code.
artificial = 0x34,
base_types = 0x35,
calling_convention = 0x36,
@@ -114,9 +116,27 @@ enum class at : std::uint16_t {
decl_column = 0x39,
decl_file = 0x3a,
decl_line = 0x3b,
// DW_AT_declaration indicates that the associated entity is a declaration rather than a definition.
// A function declaration is typically represented as a DW_TAG_subprogram entry with the attribute
// DW_AT_declaration set to true (or 1).
// It does not have attributes like DW_AT_low_pc or DW_AT_high_pc, as it does not correspond to actual code.
// Example:
// <1><0x0000003a> DW_TAG_subprogram
// DW_AT_name ("myFunction")
// DW_AT_declaration (true)

// A function implementation is also represented as a DW_TAG_subprogram entry but does not have the DW_AT_declaration attribute.
// Instead, it includes attributes like DW_AT_low_pc and DW_AT_high_pc (or DW_AT_ranges),
// which specify the address range of the function's code in memory.
// Example:
// <1><0x0000003a> DW_TAG_subprogram
// DW_AT_name ("myFunction")
// DW_AT_low_pc (0x0000000000401000)
// DW_AT_high_pc (0x0000000000401020)
declaration = 0x3c,
discr_list = 0x3d,
encoding = 0x3e,
// DW_AT_external attribute indicates that the corresponding entity (e.g., a variable, function, or type) has external linkage.
external = 0x3f,
frame_base = 0x40,
friend_ = 0x41,
@@ -125,6 +145,10 @@ enum class at : std::uint16_t {
namelist_item = 0x44,
priority = 0x45,
segment = 0x46,
// DW_AT_specification appears on an entry that completes an earlier, non-defining declaration
// of an entity (such as a function, variable, or type). Its value is a reference to that
// declaration, which is typically represented by a DW_TAG_subprogram, DW_TAG_variable,
// or similar tag.
specification = 0x47,
static_link = 0x48,
type = 0x49,
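
The `DW_AT_declaration` comment in this hunk separates function declarations from function implementations by the attributes they carry. A minimal sketch of that rule follows, assuming a hypothetical `subprogram_die` that only records which attributes are present. The type and function names are illustrative and are not ORC's; the attribute values are the standard DWARF encodings.

```cpp
#include <cstdint>
#include <set>

// Hypothetical subset of DWARF attributes (standard encodings).
enum class at : std::uint16_t {
    low_pc = 0x11,
    high_pc = 0x12,
    declaration = 0x3c,
    ranges = 0x55,
};

// Hypothetical stand-in for a DW_TAG_subprogram DIE: it only tracks which
// attributes are present, not their values.
struct subprogram_die {
    std::set<at> attributes;
    bool has(at a) const { return attributes.count(a) != 0; }
};

enum class subprogram_kind { declaration, definition, other };

// A declaration carries DW_AT_declaration; a definition carries an address
// range (low_pc/high_pc, or ranges). Anything else is reported as "other",
// e.g. an entry that has neither marker.
inline subprogram_kind classify(const subprogram_die& d) {
    if (d.has(at::declaration)) return subprogram_kind::declaration;
    const bool has_code =
        d.has(at::ranges) || (d.has(at::low_pc) && d.has(at::high_pc));
    return has_code ? subprogram_kind::definition : subprogram_kind::other;
}
```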
@@ -525,6 +549,20 @@ enum class tag : std::uint16_t {

const char* to_string(tag t);

/**
* @brief Determines if a given DWARF tag represents a type
*
* This function classifies whether a given DWARF tag represents a type definition
* or declaration. This is used to identify type-related DIEs in the DWARF debug
* information.
*
* @param t The DWARF tag to check
*
* @return true if the tag represents a type, false otherwise
*
* @pre The tag must be a valid DWARF tag
* @post The return value will be true for all type-related tags and false for all others
*/
bool is_type(tag t);

/**************************************************************************************************/
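
The diff only declares `is_type`; its definition is not shown here. One plausible shape for such a classifier is a `switch` over the tags that denote types. The sketch below covers a small, hypothetical subset of tags (with their standard DWARF encodings) and is not ORC's implementation, which presumably handles many more tags.

```cpp
#include <cstdint>

// Hypothetical subset of DWARF tags (standard encodings).
enum class tag : std::uint16_t {
    class_type = 0x02,
    enumeration_type = 0x04,
    pointer_type = 0x0f,
    structure_type = 0x13,
    typedef_ = 0x16,
    union_type = 0x17,
    base_type = 0x24,
    subprogram = 0x2e,
    variable = 0x34,
};

// Returns true for tags that describe a type, false for everything else.
constexpr bool is_type_sketch(tag t) {
    switch (t) {
        case tag::class_type:
        case tag::enumeration_type:
        case tag::pointer_type:
        case tag::structure_type:
        case tag::typedef_:
        case tag::union_type:
        case tag::base_type:
            return true;
        default:
            return false;
    }
}
```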
69 changes: 61 additions & 8 deletions include/orc/dwarf_structs.hpp
@@ -14,6 +14,9 @@
#include <string>
#include <vector>

// adobe contract checks
#include "adobe/contract_checks.hpp"

// application
#include "orc/dwarf_constants.hpp"
#include "orc/hash.hpp"
@@ -54,7 +57,7 @@ struct attribute_value {
}

auto uint() const {
- assert(has(type::uint));
+ ADOBE_PRECONDITION(has(type::uint));
return _uint;
}

@@ -64,22 +67,35 @@ struct attribute_value {
}

auto sint() const {
- assert(has(type::sint));
+ ADOBE_PRECONDITION(has(type::sint));
return _int;
}

// Return _either_ sint or uint; some attributes
// may be one or the other, but in some cases the
// valid values could be represented by either type
// (e.g., the number cannot be negative or larger
// than the largest possible signed value.)
// This routine is useful when the caller doesn't
// care how it was stored and just wants the value.
// If this attribute value has _both_, it is assumed
// they are equal.
int number() const {
return has(type::sint) ? sint() : uint();
}

void string(pool_string x) {
_type |= type::string;
_string = x;
}

const auto& string() const {
- assert(has(type::string));
+ ADOBE_PRECONDITION(has(type::string));
return _string;
}

auto string_hash() const {
- assert(has(type::string));
+ ADOBE_PRECONDITION(has(type::string));
return _string.hash();
}

@@ -89,7 +105,7 @@ struct attribute_value {
}

auto reference() const {
- assert(has(type::reference));
+ ADOBE_PRECONDITION(has(type::reference));
return _uint;
}
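
The `number()` accessor added above returns whichever representation is stored. The sketch below shows the same idea on a trimmed-down, hypothetical `value_sketch` type; it is not ORC's `attribute_value`. Two choices here are assumptions of the sketch: it returns `std::int64_t` rather than `int` so that large values are not narrowed, and it uses the standard `assert` in place of the Adobe contract macros to stay self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, trimmed-down stand-in for an attribute value that may hold
// an unsigned integer, a signed integer, or both.
struct value_sketch {
    enum kind : unsigned { kind_uint = 1u << 0, kind_sint = 1u << 1 };

    static value_sketch from_uint(std::uint64_t x) {
        value_sketch v;
        v._type |= kind_uint;
        v._uint = x;
        return v;
    }

    static value_sketch from_sint(std::int64_t x) {
        value_sketch v;
        v._type |= kind_sint;
        v._int = x;
        return v;
    }

    bool has(kind k) const { return (_type & k) != 0; }

    // Prefer the signed representation when present, otherwise fall back to
    // the unsigned one. Assumes the unsigned value fits in std::int64_t.
    std::int64_t number() const {
        assert(has(kind_sint) || has(kind_uint));
        return has(kind_sint) ? _int : static_cast<std::int64_t>(_uint);
    }

    unsigned _type{0};
    std::uint64_t _uint{0};
    std::int64_t _int{0};
};
```

A caller that only needs the magnitude, such as a size or an offset, can then call `number()` without checking how the producer encoded it.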

@@ -146,6 +162,7 @@ struct attribute {
auto reference() const { return _value.reference(); }
const auto& string() const { return _value.string(); }
auto uint() const { return _value.uint(); }
auto sint() const { return _value.sint(); }
auto string_hash() const { return _value.string_hash(); }
};

@@ -181,7 +198,7 @@ struct attribute_sequence {

bool has(dw::at name, enum attribute_value::type t) const {
auto [valid, iterator] = find(name);
- return valid ? iterator->has(t) : false;
+ return valid && iterator->has(t);
}

bool has_uint(dw::at name) const {
@@ -198,13 +215,13 @@ struct attribute_sequence {

auto& get(dw::at name) {
auto [valid, iterator] = find(name);
- assert(valid);
+ ADOBE_INVARIANT(valid);
return *iterator;
}

const auto& get(dw::at name) const {
auto [valid, iterator] = find(name);
- assert(valid);
+ ADOBE_INVARIANT(valid);
return *iterator;
}

@@ -216,6 +233,14 @@ struct attribute_sequence {
return get(name).uint();
}

int number(dw::at name) const {
return get(name)._value.number();
}

std::int64_t sint(dw::at name) const {
return get(name).sint();
}

pool_string string(dw::at name) const {
return get(name).string();
}
@@ -237,14 +262,26 @@ struct attribute_sequence {
auto end() { return _attributes.end(); }
auto end() const { return _attributes.end(); }

void erase(dw::at name) {
auto [valid, iterator] = find(name);
ADOBE_INVARIANT(valid);
_attributes.erase(iterator);
}

void move_append(attribute_sequence&& rhs) {
_attributes.insert(_attributes.end(), std::move_iterator(rhs.begin()), std::move_iterator(rhs.end()));
}

private:
/// NOTE: Consider sorting these attributes by `dw::at` to improve performance.
std::tuple<bool, iterator> find(dw::at name) {
auto result = std::find_if(_attributes.begin(), _attributes.end(), [&](const auto& attr){
return attr._name == name;
});
return std::make_tuple(result != _attributes.end(), result);
}

/// NOTE: Consider sorting these attributes by `dw::at` to improve performance.
std::tuple<bool, const_iterator> find(dw::at name) const {
auto result = std::find_if(_attributes.begin(), _attributes.end(), [&](const auto& attr){
return attr._name == name;
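
The NOTE on `find` suggests sorting the attributes by name. A minimal sketch of what that could look like follows, assuming a hypothetical `sorted_attributes` container that keeps its vector ordered on insert and uses `std::lower_bound` for lookup. None of these names exist in ORC, and whether the change pays off depends on how many attributes a typical DIE carries.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical minimal attribute: a numeric name plus a payload.
struct attr_sketch {
    std::uint16_t _name;
    std::uint64_t _value;
};

// Hypothetical container that keeps its attributes ordered by _name so that
// lookups are O(log n) instead of a linear std::find_if.
class sorted_attributes {
public:
    // Insert at the position that preserves the ordering.
    void insert(attr_sketch a) { _attributes.insert(lower(a._name), a); }

    // Returns a pointer to the attribute, or nullptr when it is absent.
    const attr_sketch* find(std::uint16_t name) const {
        const auto pos = lower(name);
        return (pos != _attributes.end() && pos->_name == name) ? &*pos : nullptr;
    }

    // Removes the attribute if present; reports whether anything was erased.
    bool erase(std::uint16_t name) {
        const auto pos = lower(name);
        if (pos == _attributes.end() || pos->_name != name) return false;
        _attributes.erase(pos);
        return true;
    }

private:
    // First position whose _name is not less than `name`.
    std::vector<attr_sketch>::const_iterator lower(std::uint16_t name) const {
        return std::lower_bound(
            _attributes.begin(), _attributes.end(), name,
            [](const attr_sketch& a, std::uint16_t n) { return a._name < n; });
    }

    std::vector<attr_sketch> _attributes;
};
```

Unlike the `erase` in the diff, which treats a missing attribute as an invariant violation, this sketch reports absence through its return value; that is a design choice of the sketch only.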
@@ -423,6 +460,22 @@ using dies = std::vector<die>;

/**************************************************************************************************/

/**
* @brief Determines if a DWARF attribute is considered non-fatal for ODRV purposes
*
* This function identifies attributes that can be safely ignored when checking for
* One Definition Rule Violations (ODRVs). These attributes typically contain
* information that doesn't affect the actual definition of a symbol, such as
* debug-specific metadata or compiler-specific extensions.
*
* @param at The DWARF attribute to check
*
* @return true if the attribute is non-fatal and can be ignored for ODRV checks,
* false if the attribute must be considered when checking for ODRVs
*
* @pre The attribute must be a valid DWARF attribute
* @post The return value will be consistent with the internal list of nonfatal attributes
*/
bool nonfatal_attribute(dw::at at);
inline bool fatal_attribute(dw::at at) { return !nonfatal_attribute(at); }
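
The documentation above describes `nonfatal_attribute` as a membership test against an internal list, but the list itself is not part of this diff. The sketch below shows one way such a test could be written. The choice of `decl_column`, `decl_file`, and `decl_line` as nonfatal entries is an assumption made purely for illustration and may not match ORC's actual list.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical subset of DWARF attributes (encodings as in dwarf_constants.hpp).
enum class at : std::uint16_t {
    decl_column = 0x39,
    decl_file = 0x3a,
    decl_line = 0x3b,
    declaration = 0x3c,
    type = 0x49,
};

// Assumed list: source-location attributes differ between translation units
// without changing the definition itself. ORC's real list may differ.
inline bool nonfatal_attribute_sketch(at a) {
    static constexpr std::array<at, 3> nonfatal{
        at::decl_column, at::decl_file, at::decl_line};
    return std::find(nonfatal.begin(), nonfatal.end(), a) != nonfatal.end();
}

// Mirrors the inline helper in the diff: fatal is the complement of nonfatal.
inline bool fatal_attribute_sketch(at a) { return !nonfatal_attribute_sketch(a); }
```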
