Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken & inaccurate detector unit tests #3817

Open
rgmz opened this issue Dec 26, 2024 · 4 comments
Open

Fix broken & inaccurate detector unit tests #3817

rgmz opened this issue Dec 26, 2024 · 4 comments
Labels

Comments

@rgmz
Copy link
Contributor

rgmz commented Dec 26, 2024

The detector unit tests were created by "reverse engineering" the patterns, thus they do not provide any real value or confirm how accurate the detectors are.

Additionally, the structure of the tests are (in my opinion) not maintainable as they are difficult to understand and do not reflect any realistic scenarios.

var (
validConsumerKey = "3WaMEd0KQtHSU7b24HEd79RZzSpMOfMdMUpIaXjq83DbNHVosCVrEVDxKiEQzT15"
invalidConsumerKey = "3Wa?Ed0KQtHSU7b24HEd79RZzSpMOfMdMUpIaXjq83DbNHVosCVrEVDxKiEQzT15"
validConsumerSecret = "5BZ70LfNshsJkDya1XaD8bMqtPWlOa2o1yKCk0H2DxnjtoaJKIcAw75GdI6zRaRD"
invalidConsumerSecret = "5BZ70LfNshsJkDya?XaD8bMqtPWlOa2o1yKCk0H2DxnjtoaJKIcAw75GdI6zRaRD"
validTokenKey = "KeYcG56ViFDleXPFJuEQ5CAGSJn7o2WDa5iGvLIvVBqZj5rMkaWFmzkp4bveJa74"
invalidTokenKey = "KeYcG56ViFDleXPFJuEQ5CAGSJn7o2WD?5iGvLIvVBqZj5rMkaWFmzkp4bveJa74"
validTokenSecret = "GGQUdyYOGDfDImJWCz4Kufk2GevaIDuVv83kIa9zCRuXIDLB4oh2eVDVPmsaSai2"
invalidTokenSecret = "GGQUdyYOGDfDImJWCz4Kufk2Ge?aIDuVv83kIa9zCRuXIDLB4oh2eVDVPmsaSai2"
validAccountID = "x1L2_BXo"
invalidAccountID = "x1L2?BXo"
keyword = "netsuite"
inputFormat = `%s id - '%s'
consumer - '%s' consumer - '%s'
token - '%s' token - '%s'`
outputPair1 = validConsumerKey + validConsumerSecret
outputPair2 = validConsumerSecret + validConsumerKey
)
func TestNetsuite_Pattern(t *testing.T) {
d := Scanner{}
ahoCorasickCore := ahocorasick.NewAhoCorasickCore([]detectors.Detector{d})
tests := []struct {
name string
input string
want []string
}{
{
name: "valid pattern - with keyword netsuite",
input: fmt.Sprintf(inputFormat, keyword, validAccountID, validConsumerKey, validConsumerSecret, validTokenKey, validTokenSecret),
want: []string{outputPair1, outputPair2, outputPair1, outputPair2},

Examples

There are dozens, if not hundreds, of problematic test files. These are illustrative.

BombBomb

The "valid" tests for BombBomb do not match the detector's pattern.

keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"bombbomb"}) + common.BuildRegexJWT("0,140", "0,419", "0,171"))

validPattern = "HUmGL.17uQMEShYp2RVMR8vypd1iqj6FZcKkQ4SazuMkbEKhzRFKuvOiwYmNWPSvkE4wiLOv-zWTkK1WkVTScRb9_io0_kvhYX31tpwR3lAJUh27RJzf1BehaJTQDXhJB6aT2gQ2LMT7dda-b3vhmEuZHzPV9AMLV6cOrcqOTkK60vMcB0PTLRQ3c_kY.a.9.hRvgogdlI8mQJrzD0myPBY7lMpjpkcskQDpOgz2I37kNDYhf7IxT6sG-a7rI1LdpJ6HhJacktlNJSswST9jbt4A0ropfJJTHGny2aId4WyPpAnQubM98F1BUnyhfkDzenaUuuQ_ZoPn9mAOsdLQUlAyp4I9oLJ_v8yQ0Q4M.Yujscho9G4ZbVTInC2mP8taCPZdRK5qt-UfAF0CX9B4E0F9NItMUbRdbm3xIkl8C6iPUcgY5OTQDBSJRLKBJgIaEyyXe10pPw.qOUhLKNPcg5qPs1xhgBsZKfW2hNTff2dCL5h6E.940ojPuT0Iw90Q8kpQ2UzeUJrhXH9_GUANKA.pjD0-YcGpnlVEDouyXaXowUoh8pLqD-BtBQfteqyFqz7THGDvQKikMy7wiBuJAo0HttMG3jw1zKtA3gM6_VIXo_K4WN6yz8Ow4n5f6Unn5zn4j2haKA4WWI5-1c8-mm7SF5VqYJVz42wBmRqB6MWXegJ7yLt_EoG1tJHftnHZ"

Kraken

The "valid" pattern is nonsensical and not correct base64 encoding. The detector should not match this, that is a defect.

validKeyPattern = "m=MN/0yYJ/5xqpE15JYDJtCFdDF7RDLuiXtTiSF1FU1H9waiub1kgwI= "

image
https://support.kraken.com/hc/en-us/articles/360000919966-How-to-create-an-API-key

viewneo

A few hundred detectors contain tests tightly coupled to the current implementation of PrefixRegex. Any changes to the prefix pattern will break the detector tests, which seems inadvisable.

{
name: "valid pattern - key out of prefix range",
input: fmt.Sprintf("%s keyword is not close to the real key in the data\n = '%s'", keyword, validPattern),
want: []string{},
},

@kashifkhan0771
Copy link
Contributor

The bombbomb pattern test case was functioning correctly when I wrote it. However, the failure was introduced by this pull request, which modified the pattern without updating the corresponding test cases. Moving forward, we should ensure that any changes to a pattern are accompanied by updates to the relevant test cases.

Additionally, I’d appreciate your input on some test cases I wrote to simulate real-world scenarios for various detectors. For example, take a look at the bombbomb detector complex pattern test. If this approach looks good, we can work on updating the remaining test cases to follow a similar strategy.

@nabeelalam
Copy link
Contributor

nabeelalam commented Dec 30, 2024

If this approach looks good, we can work on updating the remaining test cases to follow a similar strategy.

Agreed, and this may also involve updating detectors where the existing regex could potentially match incorrect patterns.

@rgmz
Copy link
Contributor Author

rgmz commented Dec 30, 2024

The bombbomb pattern test case was functioning correctly when I wrote it. However, the failure was introduced by this pull request, which modified the pattern without updating the corresponding test cases.

Small mistakes like that should be detected once #3773 is merged.

Additionally, I’d appreciate your input on some test cases I wrote to simulate real-world scenarios for various detectors. For example, take a look at the bombbomb detector complex pattern test.

That's a better approach, although you only really need 1/2 lines of surrounding context. I'd also suggest moving the inputs inline with the test cases (input: `...`) instead of sprintf'ing them.

	complexPattern = `
		
		bombbombToken := "HUmGL.17uQMEShYp2RVMR8vypd1iqj6FZcKkQ4SazuMkbEKhzRFKuvOiwYmNWPSvkE4wiLOv-zWTkK1WkVTScRb9_io0_kvhYX31tpwR3lAJUh27RJzf1BehaJTQDXhJB6aT2gQ2LMT7dda-b3vhmEuZHzPV9AMLV6cOrcqOTkK60vMcB0PTLRQ3c_kY.a.9.hRvgogdlI8mQJrzD0myPBY7lMpjpkcskQDpOgz2I37kNDYhf7IxT6sG-a7rI1LdpJ6HhJacktlNJSswST9jbt4A0ropfJJTHGny2aId4WyPpAnQubM98F1BUnyhfkDzenaUuuQ_ZoPn9mAOsdLQUlAyp4I9oLJ_v8yQ0Q4M.Yujscho9G4ZbVTInC2mP8taCPZdRK5qt-UfAF0CX9B4E0F9NItMUbRdbm3xIkl8C6iPUcgY5OTQDBSJRLKBJgIaEyyXe10pPw.qOUhLKNPcg5qPs1xhgBsZKfW2hNTff2dCL5h6E.940ojPuT0Iw90Q8kpQ2UzeUJrhXH9_GUANKA.pjD0-YcGpnlVEDouyXaXowUoh8pLqD-BtBQfteqyFqz7THGDvQKikMy7wiBuJAo0HttMG3jw1zKtA3gM6_VIXo_K4WN6yz8Ow4n5f6Unn5zn4j2haKA4WWI5-1c8-mm7SF5VqYJVz42wBmRqB6MWXegJ7yLt_EoG1tJHftnHZ"
		req.Header.Set("Authorization", bombbombToken)
	`

I tend to base test cases on results found from GitHub or SourceGraph search.
e.g., https://github.com/trufflesecurity/trufflehog/pull/3784/files#diff-8d1772a0a428fddbb34136f1096bc2461293e55d4faffcfe9a100070e11411e1

@kashifkhan0771
Copy link
Contributor

I can fix the bombbomb pattern test case as of now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants