[FEATURE] Restore Quantization API to MXNet by bgawrych · Pull Request #19587 · apache/mxnet

bgawrych · 2020-11-25T21:18:40Z

Description

This PR restores the quantization API and some examples to the master branch of MXNet. The change was prepared together with @sfraczek and @grygielski.
The quantization API now uses HybridBlock as the symbol executor, along with new features such as optimize_for.
Moreover:

enabled numpy semantics support
improved the custom calibration flow (users can now pass the names of the layers that need calibration to their layer output collectors)
renamed quantize_net_v2 to quantize_net
using DataLoader instead of DataIter in model calibration
retained quantize_model and quantize_model_mkldnn, which work with symbols
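
The custom calibration flow described in the list above can be illustrated with a minimal, framework-free sketch. It does not use the MXNet API: `MinMaxCollector`, `collect`, `calibrate`, and `int8_scale` are hypothetical names chosen for illustration, and the symmetric int8 scale is an assumption about how a calibrated range might be consumed. The collector records statistics only for the layer names it is told need calibration, and calibration stops after a fixed number of batches.

```python
from typing import Dict, Iterable, List, Tuple


class MinMaxCollector:
    """Hypothetical layer output collector: tracks min/max per named layer."""

    def __init__(self, calib_layers: Iterable[str]):
        # Only layers listed here are calibrated; other outputs are ignored.
        self.calib_layers = set(calib_layers)
        self.min_max: Dict[str, Tuple[float, float]] = {}

    def collect(self, name: str, values: List[float]) -> None:
        if name not in self.calib_layers:
            return
        lo, hi = min(values), max(values)
        if name in self.min_max:
            old_lo, old_hi = self.min_max[name]
            lo, hi = min(lo, old_lo), max(hi, old_hi)
        self.min_max[name] = (lo, hi)


def calibrate(batches, collector, num_calib_batches):
    """Feed at most num_calib_batches batches of {layer_name: outputs}."""
    used = 0
    for batch in batches:
        if used >= num_calib_batches:
            break
        for name, values in batch.items():
            collector.collect(name, values)
        used += 1
    return used


def int8_scale(lo: float, hi: float) -> float:
    """Symmetric int8 scale from a calibrated range (assumed scheme)."""
    threshold = max(abs(lo), abs(hi))
    return 127.0 / threshold if threshold > 0 else 1.0


# Stand-in for layer outputs recorded while running calibration data.
batches = [
    {"conv0": [-1.0, 0.5], "fc0": [0.0, 3.0]},
    {"conv0": [-0.25, 2.0], "fc0": [-4.0, 1.0]},
    {"conv0": [-9.0, 9.0], "fc0": [0.0, 0.0]},  # not used: beyond the batch limit
]
collector = MinMaxCollector(calib_layers=["conv0"])
used = calibrate(batches, collector, num_calib_batches=2)
print(used, collector.min_max)
```

Limiting calibration by batch count rather than by example count keeps the stopping condition independent of the batch size chosen for the data loader.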

Checklist

Essentials

PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage
Code is well-documented

@anko-intel @sfraczek @grygielski @TaoLv @pengzhao-intel @ciyongch

mxnet-bot · 2020-11-25T21:18:44Z

Hey @bgawrych , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

To trigger all jobs: @mxnet-bot run ci [all]
To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [sanity, website, unix-cpu, clang, unix-gpu, centos-gpu, windows-cpu, miscellaneous, centos-cpu, edge, windows-gpu]

Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.
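
The command syntax quoted in the bot message can be captured in a small helper. This is only a sketch: `ci_command` is a hypothetical function, not part of mxnet-bot, and the job list is copied from the message above.

```python
SUPPORTED_JOBS = [
    "sanity", "website", "unix-cpu", "clang", "unix-gpu", "centos-gpu",
    "windows-cpu", "miscellaneous", "centos-cpu", "edge", "windows-gpu",
]


def ci_command(jobs):
    """Build the PR comment text that asks mxnet-bot to run the given CI jobs."""
    if not jobs:
        raise ValueError("at least one job is required")
    # "all" is accepted as the special value shown in the bot message.
    unknown = [job for job in jobs if job != "all" and job not in SUPPORTED_JOBS]
    if unknown:
        raise ValueError(f"unsupported CI jobs: {unknown}")
    return "@mxnet-bot run ci [" + ", ".join(jobs) + "]"


print(ci_command(["unix-gpu"]))
print(ci_command(["sanity", "unix-cpu"]))
```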

bgawrych · 2020-12-10T11:03:36Z

@mxnet-bot run ci [unix-gpu]

mxnet-bot · 2020-12-10T11:03:42Z

Jenkins CI successfully triggered : [unix-gpu]

bgawrych · 2020-12-10T11:26:39Z

Do we need to start a new RFC or design doc as demonstrated originally in [1] and [2]?

[1] https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimization+and+Quantization+based+on+subgraph+and+MKL-DNN
[2] #9552

We haven't changed the design of quantization; all backend code changes relate to the changed node naming conventions. I don't know whether we must propose a new RFC in this case, since the previous one was accepted and all we have done here is bring it back.

pengzhao-intel · 2020-12-10T11:32:47Z

We don't need a new RFC, since this is a migration of the same approach (quantization flow) from 1.x to 2.x.

pengzhao-intel

LGTM. I will merge soon if there are no other comments.

bgawrych · 2020-12-10T12:06:25Z

@mxnet-bot run ci [unix-gpu]

mxnet-bot · 2020-12-10T12:06:29Z

Jenkins CI successfully triggered : [unix-gpu]

bgawrych · 2020-12-10T13:13:55Z

@szha is something wrong with the unix-gpu CI? I got the following error twice in a row:

[2020-12-10T12:17:34.769Z] Cannot contact mxnetlinux-cpu_cwcmun1huz: java.lang.InterruptedException
...
[2020-12-10T12:28:54.084Z] FAILED: CMakeFiles/mxnet.dir/src/operator/numpy/np_kron.cc.o
...
[2020-12-10T12:28:54.084Z] c++: fatal error: Killed signal terminated program cc1plus

but in the previous run, in the GPU: MKLDNN job, the failing target was CMakeFiles/mxnet.dir/src/operator/numpy/np_elemwise_broadcast_logic_op.cc.o

I don't think it's related to my change.
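
A quick way to compare failures across runs is to pull the failed build targets and killed programs out of the log text. The sketch below is an illustration only: `summarize` and the two regular expressions are hypothetical, and the sample contains just the log lines quoted above. A compiler process ending with "Killed signal" often means the operating system terminated it, for example when the build machine runs short of memory.

```python
import re

LOG = """\
[2020-12-10T12:17:34.769Z] Cannot contact mxnetlinux-cpu_cwcmun1huz: java.lang.InterruptedException
[2020-12-10T12:28:54.084Z] FAILED: CMakeFiles/mxnet.dir/src/operator/numpy/np_kron.cc.o
[2020-12-10T12:28:54.084Z] c++: fatal error: Killed signal terminated program cc1plus
"""

# A timestamp in brackets followed by "FAILED: <target>".
FAILED_RE = re.compile(r"^\[[^\]]+\]\s+FAILED:\s+(\S+)")
# The compiler driver's message when a child process is killed.
KILLED_RE = re.compile(r"Killed signal terminated program (\S+)")


def summarize(log_text):
    """Return (failed build targets, programs terminated by a kill signal)."""
    failed, killed = [], []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            failed.append(match.group(1))
        match = KILLED_RE.search(line)
        if match:
            killed.append(match.group(1))
    return failed, killed


failed, killed = summarize(LOG)
print(failed, killed)
```

If the failing target differs between runs while the killed program stays the same, the failure is less likely to be tied to one particular source file.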

bgawrych · 2020-12-10T13:14:05Z

@mxnet-bot run ci [unix-gpu]

mxnet-bot · 2020-12-10T13:14:09Z

Jenkins CI successfully triggered : [unix-gpu]

leezu · 2020-12-10T16:22:32Z

@bgawrych you can refer to #19623 for the CI issue. It looks like an effect of #18501 together with memory usage increase in recent versions of gcc. If you have time to help fix it, that would be great.

pengzhao-intel · 2020-12-12T04:30:00Z

It's great the CI passed.

Thanks to the team for the great effort to re-enable the quantization flow in MXNet 2.0 :)

I am going to merge this PR. @anko-intel @bgawrych

pengzhao-intel

Merging now.

bgawrych requested review from aaronmarkham and szha as code owners November 25, 2020 21:18

lanking520 added pr-awaiting-testing, pr-work-in-progress and removed pr-awaiting-testing labels Nov 25, 2020

Bartlomiej Gawrych and others added 24 commits November 27, 2020 10:09

Restore quantization files

840bc2d

Adapt quantization.py - Add/Remove modules

2f2adcb

Adapt part of quantization tests to new API

a2fd342

fuse fc+tanh

6d6ff15

Replace Module API with SymbolBlock in quantize_model

889a0dd

enabled test_quantization_mkldnn.py

2937d5b

Revert "fuse fc+tanh"

2ed842c

This reverts commit a8b737a473ca6529a1969b748ea03c40e12c0798. Needs refactor of conv and fc common part

Enable tests from test_subgraph.py

7694290

Enable test_mobilenetv2_struct

9364b30

Refactor test_subgraph.py

2da0f96

Reorder of conditions

b6c3740

'if calib_data is not None' and 'if not data_shapes'

Utilize optimize_for in quantization flow

d8af341

remove duplicate imports

2e8aab7

Add variable monitor callback

8bdfb0b

fix sanity

03a568f

wip

9c7483a

merge with bgawrych cannot do inplace convolution and the sum and input tesnsors are shared already remove cout spaces after if refactor if else

Rebase to master - remove with_seed

cf5376c

Add numpy support for quantization

07b3942

Conflicts: src/operator/subgraph/mkldnn/mkldnn_conv_property.h

enabled examples/quantization/imagenet_gen_qsym_mkldnn.py

a8f80e9

review fixes remove unused parameters change rgb small fix add alexnet exclude fix filename suffix refactor first conv exclude v1 v2 v3 fix names of layers fix bug

Add test to check different way of data generation for hybridize

e7428b2

Copy original network

fb7bf99

Change num_calib_examples to num_calib_batches

f095f25

enabling imagenet_inference.py

4e3591e

Add base class for collectors and feed custom with calib_layers

c898b14

lanking520 added pr-awaiting-testing, pr-work-in-progress and removed pr-awaiting-review, pr-awaiting-testing labels Dec 10, 2020

lanking520 added pr-awaiting-testing and removed pr-work-in-progress labels Dec 10, 2020

pengzhao-intel approved these changes Dec 10, 2020

lanking520 added pr-work-in-progress and removed pr-awaiting-testing labels Dec 10, 2020

lanking520 added pr-awaiting-testing, pr-work-in-progress and removed pr-work-in-progress, pr-awaiting-testing labels Dec 10, 2020

lanking520 added pr-awaiting-testing, pr-awaiting-merge and removed pr-work-in-progress, pr-awaiting-testing labels Dec 10, 2020

pengzhao-intel approved these changes Dec 12, 2020

pengzhao-intel merged commit 3746bab into apache:master Dec 12, 2020

9 participants