Event-based Malware check #7249

Merged
ewdurbin merged 41 commits into pypi:malware-detection from trail-of-forks:ww/yara-malware-check
Jan 27, 2020

Conversation

@woodruffw
Member

@woodruffw woodruffw commented Jan 16, 2020

Introduces SetupPatternCheck, an implementation of an event-based
check that scans the `setup.py`s of release files for suspicious
patterns.

Closes #7198.
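
For readers unfamiliar with the approach, here is a minimal sketch of the idea: pull `setup.py` out of an sdist and test it against a set of patterns. It is not the code in this PR. The actual check applies YARA rules; this sketch substitutes standard-library regular expressions so it runs without extra dependencies. The pattern names, `extract_setup_py`, `scan_setup_py`, and `build_sdist` are all hypothetical.

```python
import io
import re
import tarfile

# Hypothetical patterns for illustration only; the PR's real rules are YARA rules.
SUSPICIOUS_PATTERNS = {
    "process_spawn": re.compile(r"\b(?:subprocess|os\.system|os\.popen)\b"),
    "network_access": re.compile(r"\b(?:socket|urllib|requests)\b"),
}


def extract_setup_py(sdist_bytes):
    """Return the text of the top-level setup.py in a .tar.gz sdist, or None."""
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:gz") as archive:
        for member in archive.getmembers():
            parts = member.name.split("/")
            # Assume the usual "<name>-<version>/setup.py" layout: one directory deep.
            if member.isfile() and len(parts) == 2 and parts[1] == "setup.py":
                raw = archive.extractfile(member).read()
                return raw.decode("utf-8", errors="replace")
    return None


def scan_setup_py(source):
    """Return the sorted names of every pattern that matches the source."""
    return sorted(
        name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(source)
    )


def build_sdist(setup_source):
    """Build an in-memory sdist that contains only setup.py (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        data = setup_source.encode("utf-8")
        info = tarfile.TarInfo(name="example-1.0/setup.py")
        info.size = len(data)
        archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


if __name__ == "__main__":
    sdist = build_sdist("import os\nos.system('id')\n")
    print(scan_setup_py(extract_setup_py(sdist)))
```

Only the archive member one directory below the root is considered, on the assumption that nested `setup.py` files belong to vendored code rather than to the release itself.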

@woodruffw woodruffw added the malware-detection Issues related to automated malware detection. label Jan 16, 2020
Comment thread warehouse/malware/checks/example.py
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
@woodruffw woodruffw force-pushed the ww/yara-malware-check branch from 4916ffe to d8cc3d4 Compare January 17, 2020 20:14
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
Comment thread warehouse/malware/checks/setup_patterns/check.py
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
@woodruffw woodruffw changed the title WIP: Event-based Malware check Event-based Malware check Jan 23, 2020
Comment thread warehouse/malware/models.py
Comment thread warehouse/malware/checks/base.py
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
Comment thread warehouse/malware/checks/setup_patterns/check.py Outdated
Comment thread tests/unit/malware/test_setup_patterns.py Outdated
@xmunoz
Contributor

xmunoz commented Jan 23, 2020

Some minor comments about exception handling, otherwise LGTM.

@ewdurbin ewdurbin merged commit af2f66e into pypi:malware-detection Jan 27, 2020
@woodruffw woodruffw deleted the ww/yara-malware-check branch January 27, 2020 15:18
ewdurbin pushed a commit that referenced this pull request Jan 27, 2020
* requirements: Introduce yara

* [WIP] malware/check: SetupPatternCheck

In progress.

Introduces SetupPatternCheck, an implementation of an event-based
check that scans the `setup.py`s of release files for suspicious
patterns.

* malware/checks: Give MalwareCheckBase.run/scan args, kwargs

* malware: Add check preparation

Fiddle with the check/run signature a bit more.

* malware/checks: Unpack file path correctly

* docker-compose: Override FILES_BACKEND for worker

The worker needs to be able to see the "files" virtual host
during development so that malware checks can fetch their underlying
release files.

* [WIP] malware/checks: setup.py extraction

* malware/checks: setup_patterns: Fix enum, seek

* malware/checks: setup_patterns: Apply YARA rules

Each rule match becomes a verdict.

* malware/checks: setup_patterns: Prefer get over filter

* warehouse/{admin,malware}: Consistent enum names

Also enforce uniqueness for enum values.

* warehouse/{admin,malware}: More enum changes

* tests: Update admin, malware tests

* tests: Fix enum, more test fixes

* tests: Add prepare tests

* malware/changes: base: Unpack id correctly

* tests: Begin adding SetupPatternCheck tests

* malware/checks: setup_patterns: Fix enum

* tests: More SetupPatternCheck tests

* warehouse/malware: setup_patterns: Fix enums

* tests: More SetupPatternCheck tests

* tests: Add license header

* malware/checks: setup_patterns: Add TODO

* tests: More SetupPatternCheck tests

* tests: More SetupPatternCheck tests

* tests: Complete extraction tests for SetupPatternCheck

* tests: Fix test

* malware/checks: Add docstring for prepare

* malware/checks: blacken

* malware/checks: Document, expand YARA rules

* tests, warehouse: Restructure utilities

* malware: Order some enums, reduce SetupPatternCheck verdicts

* malware/models: Add missing __lt__

* malware/checks: Always embed the model object in the prepared arguments

Use it instead of performing a DB request in the check itself.

* malware/checks: Avoid raw bytes

* malware/changes: Remove unused import

* tests: Fixup malware tests

* warehouse/malware: blacken

* tests: Fill in malware coverage

* tests, warehouse: Add a benign verdict for SetupPatternCheck

* tests: blacken
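
Several of the commits above concern enum ordering, turning rule matches into verdicts, and a benign verdict. The sketch below shows one way those pieces could fit together. It is an illustration under assumptions, not the Warehouse models: `Classification`, `Verdict`, `RULE_CLASSIFICATIONS`, `verdicts_for`, `worst`, and the rule names are all hypothetical.

```python
import enum
from dataclasses import dataclass
from typing import Optional


@enum.unique  # reject duplicate enum values
class Classification(enum.Enum):
    # Declared from least to most severe; __lt__ relies on this order.
    Benign = "benign"
    Indeterminate = "indeterminate"
    Threat = "threat"

    def __lt__(self, other):
        if not isinstance(other, Classification):
            return NotImplemented
        order = list(Classification)
        return order.index(self) < order.index(other)


@dataclass(frozen=True)
class Verdict:
    rule: Optional[str]
    classification: Classification


# Hypothetical rule names mapped to the classification a match should yield.
RULE_CLASSIFICATIONS = {
    "process_spawn_in_setup": Classification.Threat,
    "network_access_in_setup": Classification.Indeterminate,
}


def verdicts_for(matched_rules):
    """Turn each rule match into a verdict; no matches yields one benign verdict."""
    if not matched_rules:
        return [Verdict(rule=None, classification=Classification.Benign)]
    return [
        Verdict(rule, RULE_CLASSIFICATIONS.get(rule, Classification.Indeterminate))
        for rule in matched_rules
    ]


def worst(verdicts):
    """Return the most severe classification among the verdicts."""
    return max(verdict.classification for verdict in verdicts)


if __name__ == "__main__":
    found = verdicts_for(["network_access_in_setup", "process_spawn_in_setup"])
    print(worst(found))
```

Defining `__lt__` is enough for `sorted` and `max` to work on the enum members, which is what makes a "most severe verdict" reduction possible.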
woodruffw added a commit to trail-of-forks/warehouse that referenced this pull request Feb 7, 2020
ewdurbin pushed a commit that referenced this pull request Feb 11, 2020
ewdurbin pushed a commit that referenced this pull request Feb 18, 2020
ewdurbin added a commit that referenced this pull request Feb 18, 2020
* Add new models for malware detection. (#7118)

* Add new models for malware detection.

Fixes #7090 and #7092.

* Code review changes.

- FK on release_file.id field instead of md5
- Change message type from String to Text
- Change Enum class in model to singular form

* Add admin interface to view and enable checks (#7134)

* Add admin interface to view and enable checks

- Implement list, detail and change_state views (#7133)
- Add unit tests for check admin view

* Add comprehensive test coverage for check admin

* Add initial hook-based check execution mechanism (#7160)

* Add initial hook-based check execution mechanism

* scratch/poc

* Add initial hook-based check execution mechanism

* Use sqlalchemy event hooks for malware checks

* Fix unit tests

* Add enum for MalwareCheckObjectType

* Add unit tests for init.

* Add tests for tasks, services, and utils.

Also, some small bugfixes in MalwareCheckFactory and the
get_enabled_checks method.

* Fix spurious task test.

* Add missing drop enum to downgrade function.

* Added TODO to dev/environment

* Be more explicit in check lookup

Co-authored-by: Ernest W. Durbin III <ewdurbin@gmail.com>

* Add malware check syncing mechanism (#7190)

* Add malware check syncing mechanism

* Code review changes.

* Refactor MalwareCheckBase. Fixes #7091. (#7196)

* Refactor MalwareCheckBase. Fixes #7091.

Add Foreign Keys in MalwareVerdicts for other types of objects
(Releases, Projects).

* Change verdict dict to kwargs.

* Add wipe-out functionality (#7202)

* Add wipe-out functionality

Related: #7133

* Call list explicitly

* Add rudimentary verdicts view. Progress on #6062. (#7207)

* Add rudimentary verdicts view. Progress on #6062.

Also, add some better testing logic for wiped_out condition.

* Code review changes.

- Conditionally show fields that are populated
- JSON pretty formatting

* Fix unit test bug.

- Use `get` instead of `filter` to look up verdict by pkey.

* simplify unit tests for verdicts view

* introduce malware queue (#7227)

* introduce malware queue

* correct syntax, apparently list of tuples documented doesn't work.

* Add backfill functionality to check admin #7094 (#7232)

* Add backfill functionality to check admin #7094

- Add backfill task
- Change lookup of checks to check_name instead of id
- Load checks that are also in "evaluation" state

* Add unit tests for backfill.

- Log number of runs executed by backfill
- Perform basic validation on sample_rate input
- Clean up other testing logic.

* Remove superfluous 'all()'

* Code review changes.

- Set backfill size to a fixed number, not configurable via web ui.
- Backfill task enqueues run_check tasks
- Only retry if `check.run` fails, not if loading the check fails.
- Use exponential backoff for retries.

* Update warehouse/admin/templates/admin/malware/checks/detail.html

Co-Authored-By: Ernest W. Durbin III <ewdurbin@gmail.com>

Co-authored-by: Ernest W. Durbin III <ewdurbin@gmail.com>

* Refactor testing logic #7098 (#7257)

- Add `schedule` field to MalwareCheck model #7096
- Move ExampleCheck into tests/common/ to remove test dependency from
prod code
- Rename functions and classes to differentiate between "hooked" and
"scheduled" checks

* Event-based Malware check (#7249)

* requirements: Introduce yara

* [WIP] malware/check: SetupPatternCheck

In progress.

Introduces SetupPatternCheck, an implementation of an event-based
check that scans the `setup.py`s of release files for suspicious
patterns.

* malware/checks: Give MalwareCheckBase.run/scan args, kwargs

* malware: Add check preparation

Fiddle with the check/run signature a bit more.

* malware/checks: Unpack file path correctly

* docker-compose: Override FILES_BACKEND for worker

The worker needs to be able to see the "files" virtual host
during development so that malware checks can fetch their underlying
release files.

* [WIP] malware/checks: setup.py extraction

* malware/checks: setup_patterns: Fix enum, seek

* malware/checks: setup_patterns: Apply YARA rules

Each rule match becomes a verdict.

* malware/checks: setup_patterns: Prefer get over filter

* warehouse/{admin,malware}: Consistent enum names

Also enforce uniqueness for enum values.

* warehouse/{admin,malware}: More enum changes

* tests: Update admin, malware tests

* tests: Fix enum, more test fixes

* tests: Add prepare tests

* malware/changes: base: Unpack id correctly

* tests: Begin adding SetupPatternCheck tests

* malware/checks: setup_patterns: Fix enum

* tests: More SetupPatternCheck tests

* warehouse/malware: setup_patterns: Fix enums

* tests: More SetupPatternCheck tests

* tests: Add license header

* malware/checks: setup_patterns: Add TODO

* tests: More SetupPatternCheck tests

* tests: More SetupPatternCheck tests

* tests: Complete extraction tests for SetupPatternCheck

* tests: Fix test

* malware/checks: Add docstring for prepare

* malware/checks: blacken

* malware/checks: Document, expand YARA rules

* tests, warehouse: Restructure utilities

* malware: Order some enums, reduce SetupPatternCheck verdicts

* malware/models: Add missing __lt__

* malware/checks: Always embed the model object in the prepared arguments

Use it instead of performing a DB request in the check itself.

* malware/checks: Avoid raw bytes

* malware/changes: Remove unused import

* tests: Fixup malware tests

* warehouse/malware: blacken

* tests: Fill in malware coverage

* tests, warehouse: Add a benign verdict for SetupPatternCheck

* tests: blacken

* Implement scheduled checks #7093 (#7271)

* Implement scheduled checks #7093

- Rename `run_backfill` to `run_evaluation` in admin malware view
- Modify `run` and `scan` method signatures to accept `**kwargs`
- Extend `run_check` to accommodate scheduled check functionality

* Reduce unit test flakiness

* Code review changes.

Also replace `check.hooked_object` with `check.hooked_object.value` in
check detail template.

* tests, warehouse: enum fixes

* Fix lint error

Co-authored-by: William Woodruff <william@yossarian.net>

* Add verdicts view filtering capabilities #6062. (#7322)

* Add verdicts view filtering capabilities #6062.

* Code review changes.

- Refactor tests to be parametrized.
- Pass `_query` to `route_path` in template.
- Remove `is None` from filter query, it adds nothing.

* Add verdict administrator review. Fixes #6062. (#7339)

* Add verdict administrator review. Fixes #6062.

- Add new `admin.verdicts.review` endpoint
- Change layout of verdict list and detail view and add forms
- Change sort order of the MalwareChecks, and update the tests

* Code review changes.

- Rename MalwareVerdict field `administrator_verdict` to `reviewer_verdict`.
- Change verdict review permission from `admin` to `moderator`.

* Misc cleanup and TODOs on malware checks. (#7355)

* Misc cleanup and TODOs on malware checks.

    - Change backfill function to invoke `IMalwareCheckService` interface
    - Add support for `kwargs` to `IMalwareCheckService` interface
    - Rename variable from reserved word `file` to `release_file`
    - Add `FatalCheckException` for non-retryable exceptions
    - Replace `MALWARE_CHECK_BACKEND` in dev/environment

* Make `IMalwareService` the entrypoint for `run_check`

- Add `run_scheduled_check` task that invokes this interface.
- Remove useless utility method
- Move `FatalCheckException` into warehouse/malware/errors.py.

* malware/checks: PackageTurnover skeleton (#7321)

* malware/checks: PackageTurnover skeleton

* malware/checks: PackageTurnover: Add NOTE

* malware/checks: PackageTurnoverCheck: more work

* tests: blacken

* malware/checks: More PackageTurnoverCheck work

* malware/checks: Blacken

* malware/checks: Blacken

* package_turnover: Promote from indeterminate to threat

* tests: Begin adding package_turnover tests

* tests: Add remaining package_turnover tests

* tests: Drop unused imports

* warehouse: Drop (ww) from NOTE

* checks/package_turnover: Drop NOTE

Co-authored-by: Cristina <hi@xmunoz.com>
Co-authored-by: William Woodruff <william@yossarian.net>
Labels

malware-detection Issues related to automated malware detection.

3 participants