Merged
1 change: 1 addition & 0 deletions .bazelrc
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
+common --enable_bzlmod
19 changes: 19 additions & 0 deletions .github/workflows/build.yml
@@ -12,6 +12,7 @@ jobs:
strategy:
fail-fast: false
matrix:
+# When adding another, also add to copybara's github_check_runs.
os: ['ubuntu-latest', 'macos-latest', 'windows-latest']
build_type: ['Release']
preset: ['make', 'windows']
@@ -54,3 +55,21 @@ jobs:
${{ github.workspace }}/build/${{ matrix.build_type }}/libgemma.lib
${{ github.workspace }}/build/gemma
${{ github.workspace }}/build/libgemma.a

+  bazel:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Harden Runner
+        uses: step-security/harden-runner@63c24ba6bd7ba022e95695ff85de572c04a18142 # v2.7.0
+        with:
+          egress-policy: audit # cannot be block - runner does git checkout
+
+      - uses: actions/checkout@8ade135a41bc03ea155e62e844d188df1ea18608 # v4.0.0
+
+      - uses: bazelbuild/setup-bazelisk@b39c379c82683a5f25d34f0d062761f62693e0b2 # v3.0.0
+
+      - uses: actions/cache@ab5e6d0c87105b4c9c2047343972218f562e4319 # v4.0.1
+        with:
+          path: ~/.cache/bazel
+          key: bazel-${{ runner.os }}
+      - run: bazel build -c opt --cxxopt=-std=c++20 //...
59 changes: 20 additions & 39 deletions BUILD.bazel
@@ -25,21 +25,14 @@ cc_library(
],
deps = [
"//compression:compress",
-# copybara:import_next_line:hwy
-"//:algo",
-# copybara:import_next_line:hwy
-"//:dot",
-# copybara:import_next_line:hwy
-"//:hwy",
-# copybara:import_next_line:hwy
-"//:math",
-# copybara:import_next_line:hwy
-"//:matvec",
-# copybara:import_next_line:hwy
-"//:profiler",
-# copybara:import_next_line:hwy
-"//:thread_pool",
-"//hwy/contrib/sort:vqsort",
+"@hwy//:algo",
+"@hwy//:dot",
+"@hwy//:hwy",
+"@hwy//:math",
+"@hwy//:matvec",
+"@hwy//:profiler",
+"@hwy//:thread_pool",
+"@hwy//hwy/contrib/sort:vqsort",
],
)

@@ -49,8 +42,7 @@ cc_library(
"util/args.h",
],
deps = [
-# copybara:import_next_line:hwy
-"//:hwy",
+"@hwy//:hwy",
],
)

@@ -61,8 +53,7 @@ cc_library(
],
deps = [
":args",
-# copybara:import_next_line:hwy
-"//:hwy",
+"@hwy//:hwy",
],
)

@@ -78,19 +69,13 @@ cc_library(
deps = [
":args",
":transformer_ops",
-"//base",
"//compression:compress",
-# copybara:import_next_line:hwy
-"//:hwy",
-# copybara:import_next_line:hwy
-"//:matvec",
-# copybara:import_next_line:hwy
-"//:nanobenchmark", # timer
-# copybara:import_next_line:hwy
-"//:profiler",
-# copybara:import_next_line:hwy
-"//:thread_pool",
-":sentencepiece_processor",
+"@hwy//:hwy",
+"@hwy//:matvec",
+"@hwy//:nanobenchmark", # timer
+"@hwy//:profiler",
+"@hwy//:thread_pool",
+"@com_google_sentencepiece//:sentencepiece_processor",
],
)

@@ -104,13 +89,9 @@ cc_binary(
":args",
":gemma_lib",
"//compression:compress",
-# copybara:import_next_line:hwy
-"//:hwy",
-# copybara:import_next_line:hwy
-"//:nanobenchmark",
-# copybara:import_next_line:hwy
-"//:profiler",
-# copybara:import_next_line:hwy
-"//:thread_pool",
+"@hwy//:hwy",
+"@hwy//:nanobenchmark",
+"@hwy//:profiler",
+"@hwy//:thread_pool",
],
)
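
Note on the label changes in BUILD.bazel: a label such as `//:hwy` resolves inside the repository that contains the BUILD file, while `@hwy//:hwy` resolves inside the external repository named `hwy` (declared in MODULE.bazel in this PR). The removed `# copybara:import_next_line:hwy` comments suggest the old labels were rewritten on export. As a rough illustration of how these labels break down, here is a minimal Python sketch. `Label` and `parse_label` are hypothetical helpers, not part of Bazel, and they handle only the absolute label forms that appear in this diff.

```python
from typing import NamedTuple


class Label(NamedTuple):
    repo: str     # "" means the repository that contains the BUILD file
    package: str  # "" means the root package of that repository
    target: str


def parse_label(label: str) -> Label:
    """Split an absolute label of the form [@repo]//package[:target].

    Illustration only: relative labels such as ":args" are rejected.
    """
    repo_part, sep, path = label.partition("//")
    if not sep:
        raise ValueError(f"not an absolute label: {label!r}")
    if repo_part and not repo_part.startswith("@"):
        raise ValueError(f"unexpected prefix in {label!r}")
    package, colon, target = path.partition(":")
    if not colon:
        # "//foo/bar" is shorthand for "//foo/bar:bar".
        target = package.rsplit("/", 1)[-1]
    return Label(repo=repo_part[1:], package=package, target=target)


if __name__ == "__main__":
    for text in ("//:hwy", "@hwy//:hwy", "@hwy//hwy/contrib/sort:vqsort"):
        print(text, "->", parse_label(text))
```

Under this reading, the old and new `hwy` labels name the same package and target and differ only in the repository component.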
17 changes: 13 additions & 4 deletions DEVELOPERS.md
@@ -127,13 +127,13 @@ working with weights, kv cache and activations (e.g. you might have multiple kv
caches and activations for a single set of weights) more directly rather than
only using a Gemma object.

-## Use the tokenizer in the Gemma object (or interact with the Tokenizer object directly)
+### Use the tokenizer in the Gemma object (or interact with the Tokenizer object directly)

You pretty much only do two things with the tokenizer: call `Encode()` to go from
string prompts to token id vectors, or `Decode()` to go from token id vector
outputs from the model back to strings.

-## The main entrypoint for generation is `GenerateGemma()`
+### The main entrypoint for generation is `GenerateGemma()`

Calling into `GenerateGemma` with a tokenized prompt will 1) mutate the
activation values in `model` and 2) invoke StreamFunc - a lambda callback for
@@ -150,19 +150,28 @@ constrained decoding type of use cases where you want to force the generation
to fit a grammar. If you're not doing this, you can send an empty lambda as a
no-op which is what `run.cc` does.

-## If you want to invoke the neural network forward function directly call the `Transformer()` function
+### If you want to invoke the neural network forward function directly call the `Transformer()` function

For high-level applications, you might only call `GenerateGemma()` and never
interact directly with the neural network, but if you're doing something a bit
more custom you can call `Transformer()`, which performs a single inference
operation on a single token and mutates the Activations and the KVCache through
the neural network computation.

-## For low level operations, defining new architectures, call `ops.h` functions directly
+### For low level operations, defining new architectures, call `ops.h` functions directly

You use `ops.h` if you're writing other NN architectures or modifying the
inference path of the Gemma model.

+## Building with Bazel
+
+The sentencepiece library we depend on requires some additional work to build
+with the Bazel build system. First, it does not export its BUILD file, so we
+provide `bazel/sentencepiece.bazel`. Second, it ships with a vendored subset of
+the Abseil library. `bazel/com_google_sentencepiece.patch` changes the code to
+support Abseil as a standalone dependency without third_party/ prefixes, similar
+to the transforms we apply to Gemma via Copybara.

## Discord

We're also trying out a discord server for discussion here -
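
Note on the "Building with Bazel" section added to DEVELOPERS.md: it says `bazel/com_google_sentencepiece.patch` lets sentencepiece use Abseil as a standalone dependency without third_party/ prefixes. The patch itself is not shown in this part of the diff, so the Python sketch below only illustrates the general kind of include-path rewrite that description suggests. The function name, the regular expression, and the sample include paths are all assumptions, not the contents of the real patch.

```python
import re

# Assumed pattern: an #include whose quoted path starts with "third_party/absl/".
_VENDORED_ABSL_INCLUDE = re.compile(
    r'^([ \t]*#[ \t]*include[ \t]+")third_party/(absl/[^"\n]+")',
    re.MULTILINE,
)


def strip_third_party_prefix(source: str) -> str:
    """Rewrite vendored Abseil includes to the standalone Abseil paths."""
    return _VENDORED_ABSL_INCLUDE.sub(r"\1\2", source)


if __name__ == "__main__":
    sample = (
        '#include "third_party/absl/strings/str_cat.h"\n'
        '#include "sentencepiece_processor.h"\n'
    )
    print(strip_third_party_prefix(sample), end="")
```

Only includes that match the assumed vendored prefix are touched; other includes pass through unchanged.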
55 changes: 50 additions & 5 deletions MODULE.bazel
@@ -3,12 +3,57 @@ module(
version = "0.1.0",
)

-bazel_dep(
-name = "rules_license",
-version = "0.0.7",
+bazel_dep(name = "rules_license", version = "0.0.7")
+bazel_dep(name = "googletest", version = "1.14.0")
+
+# Copied from Highway because Bazel does not load them transitively
+bazel_dep(name = "bazel_skylib", version = "1.4.1")
+bazel_dep(name = "rules_cc", version = "0.0.9")
+bazel_dep(name = "platforms", version = "0.0.7")
+
+http_archive = use_repo_rule("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
+
+http_archive(
+name = "hwy",
+urls = ["https://github.com/google/highway/archive/refs/tags/1.1.0.zip"],
+integrity = "sha256-zkJX2SwL4wQ0nHMsURW7MDLEf43vFSnqhSUsUM6eQmY=",
+strip_prefix = "highway-1.1.0",
)

-bazel_dep(
+http_archive(
name = "com_google_sentencepiece",
-version = "0.1.96",
+sha256 = "8409b0126ebd62b256c685d5757150cf7fcb2b92a2f2b98efb3f38fc36719754",
+strip_prefix = "sentencepiece-0.1.96",
+urls = ["https://github.com/google/sentencepiece/archive/refs/tags/v0.1.96.zip"],
+build_file = "@//bazel:sentencepiece.bazel",
+patches = ["@//bazel:com_google_sentencepiece.patch"],
+patch_args = ["-p1"],
)

# For sentencepiece
http_archive(
name = "darts_clone",
build_file_content = """
licenses(["notice"])
exports_files(["LICENSE"])
package(default_visibility = ["//visibility:public"])
cc_library(
name = "darts_clone",
hdrs = [
"include/darts.h",
],
)
""",
sha256 = "c97f55d05c98da6fcaf7f9ecc6a6dc6bc5b18b8564465f77abff8879d446491c",
strip_prefix = "darts-clone-e40ce4627526985a7767444b6ed6893ab6ff8983",
urls = [
"https://github.com/s-yata/darts-clone/archive/e40ce4627526985a7767444b6ed6893ab6ff8983.zip",
],
)
+# ABSL on 2023-10-18
+http_archive(
+name = "com_google_absl",
+sha256 = "f841f78243f179326f2a80b719f2887c38fe226d288ecdc46e2aa091e6aa43bc",
+strip_prefix = "abseil-cpp-9687a8ea750bfcddf790372093245a1d041b21a3",
+urls = ["https://github.com/abseil/abseil-cpp/archive/9687a8ea750bfcddf790372093245a1d041b21a3.tar.gz"],
+)
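
Note on the checksums in MODULE.bazel: the `hwy` archive is pinned with `integrity` (a `sha256-` prefix followed by base64 text), while the other archives use `sha256` with a hex digest. Both forms carry a SHA-256 checksum of the downloaded archive. The following Python sketch, using only the standard library, converts between the two encodings. The function names are illustrative and not part of Bazel.

```python
import base64


def hex_to_integrity(hex_digest: str) -> str:
    """Convert a hex SHA-256 digest to the 'sha256-<base64>' form."""
    raw = bytes.fromhex(hex_digest)
    if len(raw) != 32:
        raise ValueError("expected a 32-byte SHA-256 digest")
    return "sha256-" + base64.b64encode(raw).decode("ascii")


def integrity_to_hex(integrity: str) -> str:
    """Convert a 'sha256-<base64>' string back to a hex digest."""
    algorithm, _, encoded = integrity.partition("-")
    if algorithm != "sha256":
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    return base64.b64decode(encoded).hex()


if __name__ == "__main__":
    # Checksum of the sentencepiece archive, copied from the diff above.
    sentencepiece_hex = (
        "8409b0126ebd62b256c685d5757150cf7fcb2b92a2f2b98efb3f38fc36719754"
    )
    print(hex_to_integrity(sentencepiece_hex))
```

Either encoding can be compared against the output of a local `sha256sum` run after converting to a common form.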
8 changes: 8 additions & 0 deletions README.md
@@ -175,6 +175,14 @@ cmake --build --preset windows -j [number of parallel threads to use]

If the build is successful, you should now have a `gemma.exe` executable in the `build/` directory.

+#### Bazel
+
+```sh
+bazel build -c opt --cxxopt=-std=c++20 :gemma
+```
+
+If the build is successful, you should now have a `gemma` executable in the `bazel-bin/` directory.

### Step 4: Run

You can now run `gemma` from inside the `build/` directory.
24 changes: 2 additions & 22 deletions WORKSPACE
@@ -1,24 +1,4 @@
workspace(name = "gemma")

-load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
-load("@bazel_tools//tools/build_defs/repo:utils.bzl", "maybe")
-
-maybe(
-http_archive,
-name = "rules_license",
-sha256 = "4531deccb913639c30e5c7512a054d5d875698daeb75d8cf90f284375fe7c360",
-urls = [
-"https://github.com/bazelbuild/rules_license/releases/download/0.0.7/rules_license-0.0.7.tar.gz",
-],
-)
-
-maybe(
-http_archive,
-name = "com_google_sentencepiece",
-sha256 = "8409b0126ebd62b256c685d5757150cf7fcb2b92a2f2b98efb3f38fc36719754",
-strip_prefix = "sentencepiece-0.1.96",
-urls = ["https://github.com/google/sentencepiece/archive/refs/tags/v0.1.96.zip"],
-build_file = "@//third_party:sentencepiece.bazel",
-patches = ["@//third_party:com_google_sentencepiece.patch"],
-patch_args = ["-p1"],
-)
+# This file marks the root of the Bazel workspace.
+# See MODULE.bazel for external dependencies setup.
4 changes: 4 additions & 0 deletions bazel/BUILD
@@ -0,0 +1,4 @@
+package(
+default_applicable_licenses = ["//:license"],
+default_visibility = ["//:__subpackages__"],
+)