b3b488908d96f53d1b7f687979e858be0a803c27 to b2321744bcfc66cab277542cd1a56776f6859ff7
Compare revisions
Changes are shown as if the source revision was being merged into the target revision.
Source: hpc-team-public/workshop-forests-in-hpc @ b2321744bcfc66cab277542cd1a56776f6859ff7
Target: hpc-team-public/workshop-forests-in-hpc @ b3b488908d96f53d1b7f687979e858be0a803c27
Commits on Source (3)
add debug memory fn · cb83eefc · Dorothea Sommer authored 2 years ago
add more driver information in script · 1a6e0271 · Dorothea Sommer authored 2 years ago
replace bmm with matmul due to driver issues · b2321744 · Dorothea Sommer authored 2 years ago
Showing 3 changed files with 18 additions and 7 deletions
Pointnet_Example/model.py: 4 additions, 4 deletions
Pointnet_Example/submit_train.sh: 6 additions, 3 deletions
Pointnet_Example/utils.py: 8 additions, 0 deletions
Pointnet_Example/model.py @ b2321744
@@ -73,13 +73,13 @@ class PointNet(nn.Module):
     def forward(self, input):
         matrix3x3 = self.input_transform(input)
         # batch matrix multiplication
-        xb = torch.bmm(torch.transpose(input, 1, 2),
+        xb = torch.matmul(torch.transpose(input, 1, 2),
                        matrix3x3).transpose(1, 2)
         xb = F.relu(self.bn1(self.conv1(xb)))
         matrix64x64 = self.feature_transform(xb)
-        xb = torch.bmm(torch.transpose(xb, 1, 2),
+        xb = torch.matmul(torch.transpose(xb, 1, 2),
                        matrix64x64).transpose(1, 2)
         xb = F.relu(self.bn2(self.conv2(xb)))
@@ -104,8 +104,8 @@ def pointnetloss(outputs, labels, m3x3, m64x64, alpha=0.001, device=None):
     # Calculate difference to identity matrix for regularization.
     id3x3 = torch.eye(3, requires_grad=True, device=device).repeat(bs, 1, 1)
     id64x64 = torch.eye(64, requires_grad=True, device=device).repeat(bs, 1, 1)
-    diff3x3 = id3x3 - torch.bmm(m3x3, m3x3.transpose(1, 2))
-    diff64x64 = id64x64 - torch.bmm(m64x64, m64x64.transpose(1, 2))
+    diff3x3 = id3x3 - torch.matmul(m3x3, m3x3.transpose(1, 2))
+    diff64x64 = id64x64 - torch.matmul(m64x64, m64x64.transpose(1, 2))
     # Negative log likelihood criterion is already adapted to batch size.
     return criterion(outputs, labels) + alpha * (torch.norm(diff3x3) + torch.norm(diff64x64)) / float(bs)
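The regularization term in pointnetloss measures how far each transform matrix is from orthogonal by comparing A·Aᵀ with the identity. The following is a minimal pure-Python sketch of that quantity for a single matrix, written without PyTorch so it runs anywhere. The helper names (`matmul`, `transpose`, `orthogonality_deviation`) and the two sample matrices are hypothetical and not part of the repository; the repository code takes one norm over the whole batched tensor rather than per matrix.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*a)]


def orthogonality_deviation(a):
    """Frobenius norm of I - A A^T for one square matrix A."""
    n = len(a)
    aat = matmul(a, transpose(a))
    squared = sum(
        ((1.0 if i == j else 0.0) - aat[i][j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(squared)


# A rotation about the z-axis is orthogonal, so its deviation is ~0.
theta = math.pi / 6
rotation = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]

# Uniform scaling by 2 is not orthogonal: I - 4I = -3I, norm sqrt(27).
scaled = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]

print(orthogonality_deviation(rotation))
print(orthogonality_deviation(scaled))
```

A value near zero means the learned transform behaves like a rotation or reflection, which is what the penalty weighted by alpha encourages.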
Pointnet_Example/submit_train.sh @ b2321744
 #!/bin/bash
 #SBATCH --job-name=train-forest-script
 #SBATCH -p gpu # request gpu node for the training
-#SBATCH -t 00:05:00 # TODO: estimate the time you will need
-#SBATCH -G rtx5000 # requesting specific GPU, run sinfo -p gpu --format=%N,%G # to see what is available
+#SBATCH -t 00:15:00 # TODO: estimate the time you will need
+#SBATCH -G gtx1080 # requesting specific GPU, run sinfo -p gpu --format=%N,%G # to see what is available
 #SBATCH --nodes=1 # total number of nodes
 #SBATCH --ntasks=1 # total number of tasks
 #SBATCH --mail-type=begin # send mail when job begins
...
@@ -21,7 +21,10 @@ echo "Home directory: ${HOME}"
 echo "Working directory: $PWD"
 echo "Current node: ${SLURM_NODELIST}"
 # For debugging purposes.
 python --version
 python -m torch.utils.collect_env
 nvcc -V
 # Run the script.
-python train.py
\ No newline at end of file
+python -u train.py
\ No newline at end of file
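The change from `python train.py` to `python -u train.py` turns off output buffering, so progress messages reach the job's log file as they are printed rather than when the buffer fills. Where the launch command cannot be changed, a similar effect is possible from inside the script. The sketch below assumes a hypothetical helper named `log_progress`, which is not part of the repository.

```python
import sys


def log_progress(message, stream=None):
    """Print a message and flush immediately so it shows up in the job log."""
    # Resolve the stream at call time so redirected stdout is respected.
    target = sys.stdout if stream is None else stream
    print(message, file=target, flush=True)


log_progress("epoch 1 done")
```

The `-u` flag covers every print call in the program at once, which is probably why the script uses it instead of per-call flushing.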
Pointnet_Example/utils.py @ b2321744
@@ -174,3 +174,11 @@ def remove_empty_las_files(start_dir: str, verbose: bool = True) -> None:
                 os.remove(full_file_name)
                 if verbose:
                     print(f"Remove {full_file_name}.")
+
+def print_memory():
+    """ Print the GPU memory. Use for debugging. """
+    total = torch.cuda.get_device_properties(0).total_memory
+    reserved = torch.cuda.memory_reserved(0)
+    allocated = torch.cuda.memory_allocated(0)
+    free = reserved - allocated
+    print(f"Memory: Total {total} | Reserved {reserved} | Allocated {allocated} | Free memory {free}")
\ No newline at end of file
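print_memory reports raw byte counts, and its "free" value is the reserved amount minus the allocated amount, that is, memory held by PyTorch's caching allocator but not currently in use, rather than free memory on the device. A minimal sketch of a more readable report follows. `format_bytes` and `memory_summary` are hypothetical helpers, not part of the repository, and the figures in the usage line are made-up illustration values, not measurements.

```python
def format_bytes(num_bytes):
    """Render a byte count with a binary unit suffix, e.g. 1536 -> '1.5 KiB'."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


def memory_summary(total, reserved, allocated):
    """Build a one-line summary from the three byte counts print_memory gathers."""
    unused_in_pool = reserved - allocated
    return (
        f"Memory: Total {format_bytes(total)} | "
        f"Reserved {format_bytes(reserved)} | "
        f"Allocated {format_bytes(allocated)} | "
        f"Unused in pool {format_bytes(unused_in_pool)}"
    )


# Illustration values only: 8 GiB card, 2 GiB reserved, 1.5 GiB allocated.
print(memory_summary(8 * 1024**3, 2 * 1024**3, 1536 * 1024**2))
```

In the repository function, the three inputs would come from the same `torch.cuda` calls shown in the diff.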