Skip to content
This repository was archived by the owner on Nov 17, 2023. It is now read-only.

[FEATURE] Restore Quantization API to MXNet#19587

Merged
pengzhao-intel merged 40 commits intoapache:masterfrom
bgawrych:quantization_20
Dec 12, 2020
Merged

[FEATURE] Restore Quantization API to MXNet#19587
pengzhao-intel merged 40 commits intoapache:masterfrom
bgawrych:quantization_20

Conversation

@bgawrych
Copy link
Copy Markdown
Contributor

Description

This PR restores quantization API and some examples to master branch of MXNet. Change prepared together with @sfraczek and @grygielski
Quantization API now utilizes HybridBlock as symbol executor and new features like optimize_for
Moreover:

  • enabled numpy semantics support
  • improved custom calibration flow (user can now use layer names which need calibration in their layer output collectors)
  • renamed quantize_net_v2 to quantize_net
  • using DataLoader instead of DataIter in model calibration
  • retained quantize_model and quantize_model_mkldnn which works with symbol

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented

@anko-intel @sfraczek @grygielski @TaoLv @pengzhao-intel @ciyongch

@mxnet-bot
Copy link
Copy Markdown

Hey @bgawrych , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [sanity, website, unix-cpu, clang, unix-gpu, centos-gpu, windows-cpu, miscellaneous, centos-cpu, edge, windows-gpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Nov 25, 2020
Bartlomiej Gawrych and others added 24 commits November 27, 2020 10:09
This reverts commit a8b737a473ca6529a1969b748ea03c40e12c0798.
Needs refactor of conv and fc common part
'if calib_data is not None' and 'if not data_shapes'
wip
merge with bgawrych

cannot do inplace convolution and the sum and input tesnsors are shared already

remove cout

spaces after if

refactor if else
Conflicts:
	src/operator/subgraph/mkldnn/mkldnn_conv_property.h
review fixes
remove unused parameters
change rgb

small fix

add alexnet exclude

fix filename suffix

refactor first conv exclude v1

v2

v3

fix names of layers

fix bug
@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-awaiting-review PR is waiting for code review pr-awaiting-testing PR is reviewed and waiting CI build and test labels Dec 10, 2020
@bgawrych
Copy link
Copy Markdown
Contributor Author

@mxnet-bot run ci [unix-gpu]

@mxnet-bot
Copy link
Copy Markdown

Jenkins CI successfully triggered : [unix-gpu]

@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-work-in-progress PR is still work in progress labels Dec 10, 2020
@bgawrych
Copy link
Copy Markdown
Contributor Author

Do we need to start a new RFC or design doc as demonstrated originally in [1] and [2]?

[1] https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
[2] #9552

We haven't changed design of quantization - all code changes in backend are related to changed node naming conventions. I don't know if in this case we must propose new RFC as previous one was accepted and all what we have done here is bringing it back

@pengzhao-intel
Copy link
Copy Markdown
Contributor

We don't need a new RFC since this is the same approach (quantization flow) migration from 1.x to 2.x.

Copy link
Copy Markdown
Contributor

@pengzhao-intel pengzhao-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and I will merge soon if no other comments.

@lanking520 lanking520 added pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Dec 10, 2020
@bgawrych
Copy link
Copy Markdown
Contributor Author

@mxnet-bot run ci [unix-gpu]

@mxnet-bot
Copy link
Copy Markdown

Jenkins CI successfully triggered : [unix-gpu]

@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Dec 10, 2020
@bgawrych
Copy link
Copy Markdown
Contributor Author

@szha is something wrong with unix-gpu CI? I got following error two times in a row:

  • [2020-12-10T12:17:34.769Z] Cannot contact mxnetlinux-cpu_cwcmun1huz: java.lang.InterruptedException
    ...
    [2020-12-10T12:28:54.084Z] FAILED: CMakeFiles/mxnet.dir/src/operator/numpy/np_kron.cc.o
    ...
    [2020-12-10T12:28:54.084Z] c++: fatal error: Killed signal terminated program cc1plus

but in run before in GPU: MKLDNN job it was CMakeFiles/mxnet.dir/src/operator/numpy/np_elemwise_broadcast_logic_op.cc.o

I don't think it's releated to my change

@bgawrych
Copy link
Copy Markdown
Contributor Author

@mxnet-bot run ci [unix-gpu]

@mxnet-bot
Copy link
Copy Markdown

Jenkins CI successfully triggered : [unix-gpu]

@lanking520 lanking520 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-awaiting-merge Review and CI is complete. Ready to Merge and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Dec 10, 2020
@leezu
Copy link
Copy Markdown
Contributor

leezu commented Dec 10, 2020

@bgawrych you can refer to #19623 for the CI issue. It looks like an effect of #18501 together with memory usage increase in recent versions of gcc. If you have time to help fix it, that would be great.

@pengzhao-intel
Copy link
Copy Markdown
Contributor

It's great the CI passed.

Thanks the great efforts from the team to re-enable quantization flow in the MXNet 2.0: )

I am going to merge this PR. @anko-intel @bgawrych

Copy link
Copy Markdown
Contributor

@pengzhao-intel pengzhao-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging now.

@pengzhao-intel pengzhao-intel merged commit 3746bab into apache:master Dec 12, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

pr-awaiting-merge Review and CI is complete. Ready to Merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants