[DO NOT MERGE] Temporary chrome deployment #4440

Open · wants to merge 47 commits into base: master
Commits (47)
6f3dddc
add logs at the end of unpacking (#4380)
paulsemel Nov 6, 2024
c1bb8f1
Instrumenting build retrieval and unpacking times (#4339)
vitorguidi Nov 6, 2024
0d751b9
Using correct JOB_NAME env var for JOB_BUILD_RETRIEVAL_TIME (#4398)
vitorguidi Nov 12, 2024
8d0521b
build_manager: do not fetch fuzz_targets when not necessary (#4393)
paulsemel Nov 19, 2024
6e72d8a
[Monitoring] Adding a build age metric (#4341)
vitorguidi Nov 13, 2024
79002f9
Build retrieval improvements (#4412)
vitorguidi Nov 19, 2024
1b90746
Adding test case reproducibility metric (#4358)
vitorguidi Nov 4, 2024
d2dd5d6
[Monitoring] Collecting a metric for the age of untriaged testcases (…
vitorguidi Nov 13, 2024
e7bf2b3
[Monitoring] Testcase upload metrics for the triage lifecycle (#4364)
vitorguidi Nov 13, 2024
5404212
Add logging for misbehaving distribution metrics (#4429)
vitorguidi Nov 22, 2024
387fc03
Revert "Add logging for misbehaving distribution metrics (#4429)"
vitorguidi Nov 25, 2024
05f37d4
[Monitoring] Move histogram metrics to GeometricBucketer (#4432)
vitorguidi Nov 25, 2024
8394630
[Monitoring] Adding a blackbox fuzzer testcase generation time metric…
vitorguidi Nov 25, 2024
20237ab
[Monitoring] Adding metric to track time from testcase creation to bu…
vitorguidi Nov 25, 2024
ef74782
Fix Datetime not being including in data_types.py (#4439)
ParisMeuleman Nov 26, 2024
2083e89
Fix Analyze Task (#4441) (#4442)
vitorguidi Nov 26, 2024
382e2e1
Add analyze task postprocess tests
alhijazi Nov 27, 2024
1e4946d
[merge from master]build_manager: allow remote unpacking when unpacki…
paulsemel Dec 10, 2024
b7ccb1f
Merging task outcome metric and other minutia to chrome (#4491)
vitorguidi Dec 11, 2024
8e6d060
Merge #4492 to chrome temporary branch (#4493)
vitorguidi Dec 11, 2024
a5746d6
Merge 4494 into chrome temp branch (#4495)
vitorguidi Dec 12, 2024
978c311
[Monitoring] Remove granularity for stuck testcases metric (#4496) (#…
vitorguidi Dec 12, 2024
fc61aff
Merge #4500 to the chrome branch (#4501)
vitorguidi Dec 13, 2024
1461978
[Monitoring] Fix bad metric value for UNTRIAGED_TESTCASE_COUNT (#4502…
vitorguidi Dec 13, 2024
19fea40
Merge #4499 and #4481 into chrome branch (#4505)
vitorguidi Dec 16, 2024
80b37ea
Test
jonathanmetzman Dec 16, 2024
61f2558
Restore logging for android commands (#4480)
ParisMeuleman Dec 17, 2024
88a3abb
Add Fuzzilli cases to the test-input archive (#4515)
mi-ac Dec 18, 2024
2946c5d
Delete test push (#4509)
jonathanmetzman Dec 19, 2024
7bdd80b
Enable skipping minimization with an env var (#4527)
mi-ac Dec 20, 2024
ded95eb
Fix archiving Fuzzilli test cases (#4532)
mi-ac Dec 20, 2024
a9d3c55
Create a testcase reader to ingest and upload bug attachments (#4482)
pgrace-google Dec 20, 2024
3201444
Revert #4499 (#4512) (#4534)
jonathanmetzman Dec 20, 2024
6cc3008
Merging remainder PRs into chrome branch (#4546)
vitorguidi Dec 20, 2024
1a19d6d
google_issue_tracker: Set default priority. (#4379) (#4542)
jonathanmetzman Dec 20, 2024
56b7052
doc: Remove html b-tags from oss_fuzz_build_status.py (#4409) (#4539)
jonathanmetzman Dec 26, 2024
b2bbe59
Log minimization progress. (#4467) (#4536)
jonathanmetzman Dec 26, 2024
2a8afa6
Modify get_fastboot_path() to allow for custom binaries (#4518) (#4543)
jonathanmetzman Dec 26, 2024
8aa9a55
Make sure to schedule pruning for all jobs. (#4423) (#4540)
jonathanmetzman Dec 26, 2024
35301f4
Implement fuzzer weight setting. (#4392) (#4535)
jonathanmetzman Dec 26, 2024
8771f01
Skipping CCd users without GAIA accounts for buganizer issues (#4406)…
jonathanmetzman Dec 26, 2024
437a1e7
Allow configuring default corpora bucket location. (#4479) (#4541)
jonathanmetzman Dec 26, 2024
5646228
Fix OSS-fuzz builds status reporting. (#4378) (#4538)
jonathanmetzman Dec 26, 2024
8786d5c
Optimize get_artifacts_for_build() with regex filtering (#4297) (#4544)
jonathanmetzman Dec 26, 2024
c7759c7
Update device check in download_trusty_symbols_if_needed (#4475) (#4545)
jonathanmetzman Dec 26, 2024
5655b43
[Chrome deployment] Split utasks properly into success and error, add…
vitorguidi Dec 26, 2024
3a96f79
Close old non reproducible bugs (#4559)
pgrace-google Dec 26, 2024
19 changes: 19 additions & 0 deletions butler.py
@@ -132,6 +132,25 @@ def _add_weights_fuzzer_subparser(weights_subparsers):
  aggregate_parser.add_argument(
      '-j', '--jobs', help='Which jobs to aggregate.', nargs='+')

  set_parser = subparsers.add_parser(
      'set', help='Set the weight of a FuzzerJob entry.')
  set_parser.add_argument(
      '-f',
      '--fuzzer',
      help='The fuzzer field of the entry to modify.',
      required=True)
  set_parser.add_argument(
      '-j',
      '--job',
      help='The job field of the entry to modify.',
      required=True)
  set_parser.add_argument(
      '-w',
      '--weight',
      help='The new weight to set.',
      type=float,
      required=True)


def _add_weights_batches_subparser(weights_subparsers):
  """Adds a parser for the `weights fuzzer-batch` command."""
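The new `set` subparser lets the weights command overwrite a single FuzzerJob weight. Below is a minimal standalone argparse sketch of the flag surface it exposes; the `weights fuzzer set` invocation path and the sample values are assumptions inferred from the subparser names, not taken from this PR.

```python
# Standalone sketch only: mirrors the flags added to butler.py above.
# The command path in `prog` and the sample argument values are illustrative assumptions.
import argparse

parser = argparse.ArgumentParser(prog='butler.py weights fuzzer')
subparsers = parser.add_subparsers(dest='command')

set_parser = subparsers.add_parser('set', help='Set the weight of a FuzzerJob entry.')
set_parser.add_argument('-f', '--fuzzer', required=True)
set_parser.add_argument('-j', '--job', required=True)
set_parser.add_argument('-w', '--weight', type=float, required=True)

args = parser.parse_args(
    ['set', '-f', 'libFuzzer', '-j', 'libfuzzer_asan_example', '-w', '2.0'])
print(args.fuzzer, args.job, args.weight)  # -> libFuzzer libfuzzer_asan_example 2.0
```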
4 changes: 4 additions & 0 deletions src/clusterfuzz/_internal/bot/tasks/impact_task.py
@@ -18,6 +18,7 @@
from clusterfuzz._internal.build_management import build_manager
from clusterfuzz._internal.build_management import revisions
from clusterfuzz._internal.chrome import build_info
from clusterfuzz._internal.common import testcase_utils
from clusterfuzz._internal.datastore import data_handler
from clusterfuzz._internal.datastore import data_types
from clusterfuzz._internal.metrics import logs
@@ -326,4 +327,7 @@ def execute_task(testcase_id, job_type):
  impacts = get_impacts_from_url(testcase.regression, testcase.job_type)
  testcase = data_handler.get_testcase_by_id(testcase_id)
  set_testcase_with_impacts(testcase, impacts)
  testcase_utils.emit_testcase_triage_duration_metric(
      testcase_id,
      testcase_utils.TESTCASE_TRIAGE_DURATION_IMPACT_COMPLETED_STEP)
  data_handler.update_testcase_comment(testcase, data_types.TaskState.FINISHED)
51 changes: 51 additions & 0 deletions src/clusterfuzz/_internal/bot/tasks/utasks/__init__.py
@@ -27,6 +27,7 @@
from clusterfuzz._internal.bot.webserver import http_server
from clusterfuzz._internal.metrics import logs
from clusterfuzz._internal.metrics import monitoring_metrics
from clusterfuzz._internal.protos import uworker_msg_pb2
from clusterfuzz._internal.system import environment

# Define an alias to appease pylint.
@@ -74,12 +75,26 @@ class _MetricRecorder(contextlib.AbstractContextManager):
  Members:
    start_time_ns (int): The time at which this recorder was constructed, in
      nanoseconds since the Unix epoch.
    utask_main_failure: stores the uworker_output.ErrorType returned by
      utask_main, and is used to emit a metric.
  """

  def __init__(self, subtask: _Subtask):
    self.start_time_ns = time.time_ns()
    self._subtask = subtask
    self._labels = None
    self.utask_main_failure = None
    self._utask_success_conditions = [
        None,  # This can be a successful return value in, e.g., fuzz task.
        uworker_msg_pb2.ErrorType.NO_ERROR,  # pylint: disable=no-member
        uworker_msg_pb2.ErrorType.ANALYZE_NO_CRASH,  # pylint: disable=no-member
        uworker_msg_pb2.ErrorType.PROGRESSION_BAD_STATE_MIN_MAX,  # pylint: disable=no-member
        uworker_msg_pb2.ErrorType.REGRESSION_NO_CRASH,  # pylint: disable=no-member
        uworker_msg_pb2.ErrorType.REGRESSION_LOW_CONFIDENCE_IN_REGRESSION_RANGE,  # pylint: disable=no-member
        uworker_msg_pb2.ErrorType.MINIMIZE_CRASH_TOO_FLAKY,  # pylint: disable=no-member
        uworker_msg_pb2.ErrorType.LIBFUZZER_MINIMIZATION_UNREPRODUCIBLE,  # pylint: disable=no-member
        uworker_msg_pb2.ErrorType.ANALYZE_CLOSE_INVALID_UPLOADED,  # pylint: disable=no-member
    ]

    if subtask == _Subtask.PREPROCESS:
      self._preprocess_start_time_ns = self.start_time_ns
@@ -121,6 +136,12 @@ def set_task_details(self,
    # Ensure we always have a value after this method returns.
    assert self._preprocess_start_time_ns is not None

  def _infer_uworker_main_outcome(self, exc_type, uworker_error) -> bool:
    """Returns True if task succeeded, False otherwise."""
    if exc_type or uworker_error not in self._utask_success_conditions:
      return False
    return True

  def __exit__(self, _exc_type, _exc_value, _traceback):
    # Ignore exception details, let Python continue unwinding the stack.

@@ -138,6 +159,31 @@ def __exit__(self, _exc_type, _exc_value, _traceback):
    monitoring_metrics.UTASK_SUBTASK_E2E_DURATION_SECS.add(
        e2e_duration_secs, self._labels)

    # The only case where a task might fail without throwing is in
    # utask_main, by returning an ErrorType proto that indicates failure.
    task_succeeded = self._infer_uworker_main_outcome(_exc_type,
                                                      self.utask_main_failure)
    monitoring_metrics.TASK_OUTCOME_COUNT.increment({
        **self._labels, 'task_succeeded': task_succeeded
    })
    if task_succeeded:
      error_condition = 'N/A'
    elif _exc_type:
      error_condition = 'UNHANDLED_EXCEPTION'
    else:
      error_condition = uworker_msg_pb2.ErrorType.Name(  # pylint: disable=no-member
          self.utask_main_failure)
    # Drop the job label so this second, more explicit per-error-condition
    # metric stays within the 30k distinct labels limit recommended by GCP.
    trimmed_labels = self._labels
    del trimmed_labels['job']
    trimmed_labels['task_succeeded'] = task_succeeded
    trimmed_labels['error_condition'] = error_condition
    monitoring_metrics.TASK_OUTCOME_COUNT_BY_ERROR_TYPE.increment(
        trimmed_labels)


def ensure_uworker_env_type_safety(uworker_env):
"""Converts all values in |uworker_env| to str types.
@@ -226,6 +272,8 @@ def uworker_main_no_io(utask_module, serialized_uworker_input):
      return None

    # NOTE: Keep this in sync with `uworker_main()`.
    if uworker_output.error_type != uworker_msg_pb2.ErrorType.NO_ERROR:  # pylint: disable=no-member
      recorder.utask_main_failure = uworker_output.error_type
    uworker_output.bot_name = environment.get_value('BOT_NAME', '')
    uworker_output.platform_id = environment.get_platform_id()

@@ -306,6 +354,9 @@ def uworker_main(input_download_url) -> None:
    logs.info('Starting utask_main: %s.' % utask_module)
    uworker_output = utask_module.utask_main(uworker_input)

    if uworker_output.error_type != uworker_msg_pb2.ErrorType.NO_ERROR:  # pylint: disable=no-member
      recorder.utask_main_failure = uworker_output.error_type

    # NOTE: Keep this in sync with `uworker_main_no_io()`.
    uworker_output.bot_name = environment.get_value('BOT_NAME', '')
    uworker_output.platform_id = environment.get_platform_id()
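Taken together, the recorder treats a subtask as successful when utask_main raises nothing and returns either no error or one of the benign ErrorType values in `_utask_success_conditions`; everything else is counted as a failure, labeled with the ErrorType name or with `UNHANDLED_EXCEPTION`. A small self-contained sketch of that classification follows; the enum values are illustrative stand-ins, not the real `uworker_msg_pb2` numbers.

```python
# Self-contained illustration of the outcome classification in _MetricRecorder.__exit__.
# ErrorType is a stand-in enum; only the names NO_ERROR and ANALYZE_NO_CRASH mirror real ones.
import enum


class ErrorType(enum.IntEnum):
  NO_ERROR = 0
  ANALYZE_NO_CRASH = 1
  ANALYZE_BUILD_SETUP = 2  # hypothetical "real failure" value


SUCCESS_CONDITIONS = {None, ErrorType.NO_ERROR, ErrorType.ANALYZE_NO_CRASH}


def classify(exc_type, uworker_error):
  """Returns (task_succeeded, error_condition), mirroring the __exit__ logic above."""
  task_succeeded = not exc_type and uworker_error in SUCCESS_CONDITIONS
  if task_succeeded:
    return True, 'N/A'
  if exc_type:
    return False, 'UNHANDLED_EXCEPTION'
  return False, uworker_error.name


assert classify(None, ErrorType.NO_ERROR) == (True, 'N/A')
assert classify(None, ErrorType.ANALYZE_NO_CRASH) == (True, 'N/A')
assert classify(None, ErrorType.ANALYZE_BUILD_SETUP) == (False, 'ANALYZE_BUILD_SETUP')
assert classify(RuntimeError, ErrorType.NO_ERROR) == (False, 'UNHANDLED_EXCEPTION')
```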
52 changes: 39 additions & 13 deletions src/clusterfuzz/_internal/bot/tasks/utasks/analyze_task.py
@@ -13,7 +13,6 @@
# limitations under the License.
"""Analyze task for handling user uploads."""

import datetime
import json
from typing import Dict
from typing import Optional
@@ -28,12 +27,14 @@
from clusterfuzz._internal.bot.tasks.utasks import uworker_io
from clusterfuzz._internal.build_management import build_manager
from clusterfuzz._internal.build_management import revisions
from clusterfuzz._internal.common import testcase_utils
from clusterfuzz._internal.crash_analysis import crash_analyzer
from clusterfuzz._internal.crash_analysis import severity_analyzer
from clusterfuzz._internal.datastore import data_handler
from clusterfuzz._internal.datastore import data_types
from clusterfuzz._internal.fuzzing import leak_blacklist
from clusterfuzz._internal.metrics import logs
from clusterfuzz._internal.metrics import monitoring_metrics
from clusterfuzz._internal.protos import uworker_msg_pb2
from clusterfuzz._internal.system import environment

@@ -118,7 +119,7 @@ def handle_analyze_no_revision_index(output):

def handle_analyze_close_invalid_uploaded(output):
  testcase = data_handler.get_testcase_by_id(output.uworker_input.testcase_id)
  testcase_upload_metadata = query_testcase_upload_metadata(
  testcase_upload_metadata = testcase_utils.get_testcase_upload_metadata(
      output.uworker_input.testcase_id)
  data_handler.close_invalid_uploaded_testcase(
      testcase, testcase_upload_metadata, 'Irrelevant')
@@ -258,7 +259,7 @@ def handle_noncrash(output):
    tasks.add_task('analyze', output.uworker_input.testcase_id,
                   output.uworker_input.job_type)
    return
  testcase_upload_metadata = query_testcase_upload_metadata(
  testcase_upload_metadata = testcase_utils.get_testcase_upload_metadata(
      output.uworker_input.testcase_id)
  data_handler.mark_invalid_uploaded_testcase(
      testcase, testcase_upload_metadata, 'Unreproducible')
@@ -298,17 +299,24 @@ def utask_preprocess(testcase_id, job_type, uworker_env):
  testcase = data_handler.get_testcase_by_id(testcase_id)
  data_handler.update_testcase_comment(testcase, data_types.TaskState.STARTED)

  testcase_upload_metadata = query_testcase_upload_metadata(testcase_id)
  testcase_upload_metadata = testcase_utils.get_testcase_upload_metadata(
      testcase_id)
  if not testcase_upload_metadata:
    logs.error('Testcase %s has no associated upload metadata.' % testcase_id)
    testcase.key.delete()
    return None

  # Store the bot name and timestamp in upload metadata.
  testcase_upload_metadata.bot_name = environment.get_value('BOT_NAME')
  testcase_upload_metadata.timestamp = datetime.datetime.utcnow()
  testcase_upload_metadata.put()

  # Emits a TESTCASE_TRIAGE_DURATION metric to track the time elapsed between
  # testcase upload and the analyze task being pulled from the queue.
  testcase_utils.emit_testcase_triage_duration_metric(
      int(testcase_id),
      testcase_utils.TESTCASE_TRIAGE_DURATION_ANALYZE_LAUNCHED_STEP)

  initialize_testcase_for_main(testcase, job_type)

  setup_input = setup.preprocess_setup_testcase(testcase, uworker_env)
@@ -409,6 +417,14 @@ def utask_main(uworker_input):
  analyze_task_output.crash_stacktrace = testcase.crash_stacktrace

  if not crashed:
    monitoring_metrics.ANALYZE_TASK_REPRODUCIBILITY.increment(
        labels={
            'fuzzer_name': uworker_input.fuzzer_name,
            'job': uworker_input.job_type,
            'crashes': False,
            'reproducible': False,
            'platform': environment.platform(),
        })
    return uworker_msg_pb2.Output(  # pylint: disable=no-member
        analyze_task_output=analyze_task_output,
        error_type=uworker_msg_pb2.ErrorType.ANALYZE_NO_CRASH,  # pylint: disable=no-member
@@ -425,8 +441,18 @@

  test_for_reproducibility(fuzz_target, testcase, testcase_file_path, state,
                           test_timeout)

  analyze_task_output.one_time_crasher_flag = testcase.one_time_crasher_flag

  monitoring_metrics.ANALYZE_TASK_REPRODUCIBILITY.increment(
      labels={
          'fuzzer_name': uworker_input.fuzzer_name,
          'job': uworker_input.job_type,
          'crashes': True,
          'reproducible': not testcase.one_time_crasher_flag,
          'platform': environment.platform(),
      })

  fuzz_target_metadata = engine_common.get_fuzz_target_issue_metadata(
      fuzz_target)

@@ -461,7 +487,7 @@ def handle_build_setup_error(output):
        output.uworker_input.job_type,
        wait_time=testcase_fail_wait)
    return
  testcase_upload_metadata = query_testcase_upload_metadata(
  testcase_upload_metadata = testcase_utils.get_testcase_upload_metadata(
      output.uworker_input.testcase_id)
  data_handler.mark_invalid_uploaded_testcase(
      testcase, testcase_upload_metadata, 'Build setup failed')
@@ -526,18 +552,24 @@ def _update_testcase(output):
  if analyze_task_output.platform_id:
    testcase.platform_id = analyze_task_output.platform_id

  testcase.analyze_pending = False

  testcase.put()


def utask_postprocess(output):
  """Trusted: Cleans up after a uworker execute_task, writing anything needed to
  the db."""
  testcase_utils.emit_testcase_triage_duration_metric(
      int(output.uworker_input.testcase_id),
      testcase_utils.TESTCASE_TRIAGE_DURATION_ANALYZE_COMPLETED_STEP)
  _update_testcase(output)
  if output.error_type != uworker_msg_pb2.ErrorType.NO_ERROR:  # pylint: disable=no-member
    _ERROR_HANDLER.handle(output)
    return

  testcase = data_handler.get_testcase_by_id(output.uworker_input.testcase_id)
  testcase_upload_metadata = query_testcase_upload_metadata(
  testcase_upload_metadata = testcase_utils.get_testcase_upload_metadata(
      output.uworker_input.testcase_id)

  log_message = (f'Testcase crashed in {output.test_timeout} seconds '
@@ -592,9 +624,3 @@ def utask_postprocess(output):
  # 5. Get second stacktrace from another job in case of
  #    one-time crashes (stack).
  task_creation.create_tasks(testcase)


def query_testcase_upload_metadata(
    testcase_id: str) -> Optional[data_types.TestcaseUploadMetadata]:
  return data_types.TestcaseUploadMetadata.query(
      data_types.TestcaseUploadMetadata.testcase_id == int(testcase_id)).get()
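For reference, the two `ANALYZE_TASK_REPRODUCIBILITY` increments above can only produce three label combinations, summarized in the sketch below; the fuzzer_name, job, and platform values are placeholders, while the label names come from the diff.

```python
# Summary of the label tuples emitted by the ANALYZE_TASK_REPRODUCIBILITY
# increments in analyze_task.py above. Placeholder fuzzer/job/platform values.
COMMON = {'fuzzer_name': 'example_fuzzer', 'job': 'example_job', 'platform': 'LINUX'}

ANALYZE_REPRODUCIBILITY_OUTCOMES = [
    # Testcase did not crash at all.
    {**COMMON, 'crashes': False, 'reproducible': False},
    # Crashed, but was flagged as a one-time (flaky) crasher.
    {**COMMON, 'crashes': True, 'reproducible': False},
    # Crashed and reproduced reliably.
    {**COMMON, 'crashes': True, 'reproducible': True},
]
```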
21 changes: 21 additions & 0 deletions src/clusterfuzz/_internal/bot/tasks/utasks/fuzz_task.py
@@ -1556,6 +1556,23 @@ def do_engine_fuzzing(self, engine_impl):

    return crashes, fuzzer_metadata

  def _emit_testcase_generation_time_metric(self, start_time, testcase_count,
                                            fuzzer, job):
    testcase_generation_finish = time.time()
    elapsed_testcase_generation_time = testcase_generation_finish
    elapsed_testcase_generation_time -= start_time
    # Avoid division by zero.
    if testcase_count:
      average_time_per_testcase = elapsed_testcase_generation_time
      average_time_per_testcase = average_time_per_testcase / testcase_count
      monitoring_metrics.TESTCASE_GENERATION_AVERAGE_TIME.add(
          average_time_per_testcase,
          labels={
              'job': job,
              'fuzzer': fuzzer,
              'platform': environment.platform(),
          })

  def do_blackbox_fuzzing(self, fuzzer, fuzzer_directory, job_type):
    """Run blackbox fuzzing. Currently also used for engine fuzzing."""
    # Set the thread timeout values.
@@ -1579,11 +1596,15 @@ def do_blackbox_fuzzing(self, fuzzer, fuzzer_directory, job_type):

    # Run the fuzzer to generate testcases. If an error occurred while trying
    # to run the fuzzer, bail out.
    testcase_generation_start = time.time()
    generate_result = self.generate_blackbox_testcases(
        fuzzer, job_type, fuzzer_directory, testcase_count)
    if not generate_result.success:
      return None, None, None, None

    self._emit_testcase_generation_time_metric(
        testcase_generation_start, testcase_count, fuzzer.name, job_type)

    environment.set_value('FUZZER_NAME', self.fully_qualified_fuzzer_name)

    # Initialize a list of crashes.
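As a quick worked example of the average computed by `_emit_testcase_generation_time_metric` above (numbers are illustrative): generating 8 blackbox testcases in 2 seconds records an average of 0.25 seconds per testcase, and a zero testcase count emits nothing.

```python
# Illustrative numbers only; mirrors the arithmetic and the division-by-zero
# guard in _emit_testcase_generation_time_metric above.
start_time = 100.0   # time.time() before generate_blackbox_testcases()
finish_time = 102.0  # time.time() after generation finished
testcase_count = 8

elapsed = finish_time - start_time  # 2.0 seconds
if testcase_count:
  average_time_per_testcase = elapsed / testcase_count  # 0.25 seconds
  print(average_time_per_testcase)  # value added to TESTCASE_GENERATION_AVERAGE_TIME
```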